
Zstandard Compression and the application/zstd Media Type
draft-kucherawy-dispatch-zstd-03

Document history

RFC 8478
draft-kucherawy-dispatch-zstd

Date	Rev.	By	Action
2018-10-02	03	(System)	RFC Editor state changed to AUTH48-DONE from AUTH48
2018-09-19	03	(System)	RFC Editor state changed to AUTH48 from RFC-EDITOR
2018-09-07	03	(System)	RFC Editor state changed to RFC-EDITOR from EDIT
2018-07-17	03	(System)	IANA Action state changed to RFC-Ed-Ack from Waiting on RFC Editor
2018-07-17	03	(System)	IANA Action state changed to Waiting on RFC Editor from In Progress
2018-07-17	03	(System)	IANA Action state changed to In Progress from Waiting on Authors
2018-07-16	03	(System)	RFC Editor state changed to EDIT
2018-07-16	03	(System)	IESG state changed to RFC Ed Queue from Approved-announcement sent
2018-07-16	03	(System)	Announcement was received by RFC Editor
2018-07-16	03	(System)	IANA Action state changed to Waiting on Authors from In Progress
2018-07-16	03	(System)	IANA Action state changed to In Progress
2018-07-16	03	Cindy Morgan	IESG state changed to Approved-announcement sent from Approved-announcement to be sent
2018-07-16	03	Cindy Morgan	IESG has approved the document
2018-07-16	03	Cindy Morgan	Closed "Approve" ballot
2018-07-16	03	Cindy Morgan	Ballot approval text was generated
2018-07-16	03	Alexey Melnikov	IESG state changed to Approved-announcement to be sent from IESG Evaluation::AD Followup
2018-07-15	03	Murray Kucherawy	New version available: draft-kucherawy-dispatch-zstd-03.txt
2018-07-15	03	(System)	New version approved
2018-07-15	03	(System)	Request for posting confirmation emailed to previous authors: Yann Collet, Murray Kucherawy
2018-07-15	03	Murray Kucherawy	Uploaded new revision
2018-07-13	02	Alexey Melnikov	[Ballot comment] Checking a few minor points with authors before approval.
2018-07-13	02	Alexey Melnikov	Ballot comment text updated for Alexey Melnikov
2018-07-12	02	Benjamin Kaduk	[Ballot comment] Thanks for addressing my DISCUSS (and pointing out that my other point was not grounded in fact)! I think I had intended to switch to Yes, but since I was so slow to respond to the author comments, I have forgotten enough about the document that I will just go with No Objection instead, to avoid any further delay. Original comments preserved below: Some high-level comments: After reading Section 2.4.2.1 (and subsections), I'm not sure I fully understand the procedures for the Huffman coding. Granted, in order to obtain the efficient compression ratios the procedure is necessarily complicated, so arguably there is some onus on me as the reader to work harder to understand. Having said that, I think the mapping from weights to prefixes in the prefix code makes sense to me, but I'm not sure I understand how the weights are assigned to their respective symbols/literals (i.e., what the prefix decodes to). My current reading of the text is that weights are given for literals 0 through (last literal - 1), and in particular if I want to do raw (non-FSE) weights, I can only do 128 literals like this. So, are the symbols/literals just these values from 0 to 127? That seems unlikely to be correct, given that we want to compress data containing bytes with the high bit set, but I'm unsure where I'm going astray. It seems like a link to some dedicated/official test vectors would be useful, if test vectors themselves are seen as being too bulky to include in the document. The text about third-party dictionaries in Section 4 makes me wonder if it should explicitly be stated that "dictionary input to decompression should be treated as being similarly untrusted as input compressed frames, and the appropriate bounds checking performed on data accesses as needed".
In Appendix B, I don't see a reference to match up to a chapter "from normalized distribution to decoding tables". Some more minor section-by-section comments follow. Section 2.1.1 Should the frame-structure diagram list "0 or 4 bytes" for the content checksum? Content_Checksum: An optional 32-bit checksum, only present if the Content_Checksum_flag is set. The content checksum is the result of the xxh64() hash function [XXHASH] digesting the origina "original" Section 2.1.1.1.1.2 In this case, Window_Descriptor byte is skipped, but "the Window_Descriptor byte" What's the difference between "Unused" and "Reserved", viz. 2.1.1.1.1.3 and 2.1.1.1.1.4? Section 2.1.1.1.1.6. This is a two-bit flag (= FHD & 3) two-bit flag (whose value is obtained by masking the frame header descriptor with 0x3) Section 2.1.1.1.2. Provides guarantees on minimum memory buffer required to decompress a frame. [...] "the minimum memory buffer" Section 2.1.1.1.3. Field size depends on Dictionary_ID_flag. [...] "The field size depends on the" How are dictionary (ID)s allocated and standardized? Section 2.1.1.2.2. RLE_Block: This is a single byte, repeated Block_Size times. Block_Content consists of a single byte. On the decompression side, this byte must be repeated Block_Size times. I think we need normative language on "the decoder MUST verify [...]". Section 2.1.1.3 This "all previously decoded data" is only within a frame, right? It might be worth reiterating that here. Section 2.1.1.3.1.1. For Size_Format for Raw_Literals_Block and RLE_Literals_Block, I think we need to explicitly say "when parsing the flags bit-by-bit, if the low-order bit of the Size_Format field is zero, the field is only one bit, and processing proceeds to the next bitfield. Regenerated_Size uses [...]" Section 2.1.1.3.2.1. Please expand FSE on first use. Section 2.4.1.1 Finally, the decoder can tell how many bytes were used in this process, and how many symbols are present. 
The bitstream consumes a round number of bytes. Any remaining bit within the last byte is simply unused. Usually we say "set to zero on encode, ignore on decode" to make things deterministic. Section 2.4.2 When decompressing, the last byte containing the padding is the first byte to read. The decompressor needs to skip 0-7 initial 0-bits and the first 1-bit lt occurs. Afterwards, the useful part of the bitstream begins. I guess "lt" is supposed to be "that"? Section 2.4.2.1 The tree depth is 4, since its smallest element uses 4 bits. It's not entirely clear to me what "smallest" means here -- presumably "lowest in the tree", though that is rather tautological in this context. "Smallest weight" would make sense, except there is not a weight column in the table. Section 2.4.2.1.3 if Number_of_Bits != 0 Number_of_Bits = Max_Number_of_Bits + 1 - Weight Should the first Number_of_Bits be Weight?
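The closing question about Section 2.4.2.1.3 can be made concrete with a minimal sketch. It assumes the condition is meant to test Weight, as the reviewer suggests, so that a weight of zero marks a symbol that does not appear. The function name and the sample weights are hypothetical and are not taken from the draft.

```python
def weights_to_bit_lengths(weights, max_number_of_bits):
    """Convert Huffman weights to prefix-code lengths.

    Assumption: a weight of 0 means the symbol is unused and gets length 0;
    otherwise Number_of_Bits = Max_Number_of_Bits + 1 - Weight.
    """
    lengths = []
    for weight in weights:
        if weight != 0:
            lengths.append(max_number_of_bits + 1 - weight)
        else:
            lengths.append(0)
    return lengths


# Hypothetical weights for six symbols with a tree depth of 4.
sample_weights = [4, 3, 2, 0, 1, 1]
print(weights_to_bit_lengths(sample_weights, 4))  # [1, 2, 3, 0, 4, 4]
```

Under this reading, testing Number_of_Bits before it has been assigned would be meaningless, which is why the condition most plausibly refers to Weight.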
2018-07-12	02	Benjamin Kaduk	[Ballot Position Update] Position for Benjamin Kaduk has been changed to No Objection from Discuss
2018-07-11	02	Adam Roach	[Ballot comment] Thanks for addressing my discuss and comments.
2018-07-11	02	Adam Roach	[Ballot Position Update] Position for Adam Roach has been changed to No Objection from Discuss
2018-07-11	02	(System)	Sub state has been changed to AD Followup from Revised ID Needed
2018-07-11	02	(System)	IANA Review state changed to Version Changed - Review Needed from IANA OK - Actions Needed
2018-07-11	02	Cindy Morgan	New version available: draft-kucherawy-dispatch-zstd-02.txt
2018-07-11	02	(System)	Secretariat manually posting. Approvals already received
2018-07-11	02	Cindy Morgan	Uploaded new revision
2018-05-24	01	Cindy Morgan	IESG state changed to IESG Evaluation::Revised I-D Needed from IESG Evaluation
2018-05-24	01	Ignas Bagdonas	[Ballot comment] No objection as in "have read the document but it is not in the domain of expertise that I can authoritatively comment on".
2018-05-24	01	Ignas Bagdonas	[Ballot Position Update] New position, No Objection, has been recorded for Ignas Bagdonas
2018-05-24	01	Benjamin Kaduk	[Ballot discuss] I support Adam's DISCUSS. Additionally, I think that there are significant privacy considerations associated with the Skippable Frames described in Section 2.3, that should be documented before this document advances. Specifically, this provides an easy way for a party (not even necessarily the encoder, since these frames can be inserted independently from the actual compression scheme) to insert (e.g.) tracking data into a compressed stream and have it ignored by standard decoders. There are myriad possibilities for how this could be used, such as for watermarking files with information about how they were downloaded/generated/etc., which could be used for tracking leaks from confidential materials or illegal distribution of copyrighted content; there is potential for personally identifying information to be included; the list goes on. I can see that there can also be useful ways to use these frames to introduce additional metadata about the compressed content, but fear that we may want to give guidance for these frames to be stripped/forbidden/etc. absent additional context to indicate that the information in the skippable frame is non-malicious. A more minor note, but still IMO blocking -- in Section 2.1.1.1.2: windowLog = 10 + Exponent; windowBase = 1 << windowLog; windowAdd = (windowBase / 8) * Mantissa; Window_Size = windowBase + windowAdd; I don't think this formula is correct -- windowAdd in this formula is not modified by windowLog at all, which does not match up with the stated maximum bound in the body text.
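The Window_Size formula quoted above can be checked with a short sketch. The arithmetic follows the four quoted statements directly. The helper that splits a Window_Descriptor byte assumes the Exponent occupies the top five bits and the Mantissa the low three; that layout is not quoted in this comment, and both function names are hypothetical.

```python
def window_size(exponent, mantissa):
    """Evaluate the quoted formula for one Exponent/Mantissa pair."""
    window_log = 10 + exponent
    window_base = 1 << window_log
    # window_add is derived from window_base, so it scales with window_log too.
    window_add = (window_base // 8) * mantissa
    return window_base + window_add


def split_window_descriptor(descriptor):
    """Assumed layout: Exponent in bits 7-3, Mantissa in bits 2-0."""
    return descriptor >> 3, descriptor & 0x7


print(window_size(0, 0))                           # 1024
print(window_size(*split_window_descriptor(0xFF)))  # 4123168604160
```

Because windowAdd is computed from windowBase, it does grow with windowLog, which appears to be the point the later ballot comment describes as "not grounded in fact".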
2018-05-24	01	Benjamin Kaduk	[Ballot comment] Some high-level comments: After reading Section 2.4.2.1 (and subsections), I'm not sure I fully understand the procedures for the Huffman coding. Granted, in order to obtain the efficient compression ratios the procedure is necessarily complicated, so arguably there is some onus on me as the reader to work harder to understand. Having said that, I think the mapping from weights to prefixes in the prefix code makes sense to me, but I'm not sure I understand how the weights are assigned to their respective symbols/literals (i.e., what the prefix decodes to). My current reading of the text is that weights are given for literals 0 through (last literal - 1), and in particular if I want to do raw (non-FSE) weights, I can only do 128 literals like this. So, are the symbols/literals just these values from 0 to 127? That seems unlikely to be correct, given that we want to compress data containing bytes with the high bit set, but I'm unsure where I'm going astray. It seems like a link to some dedicated/official test vectors would be useful, if test vectors themselves are seen as being too bulky to include in the document. The text about third-party dictionaries in Section 4 makes me wonder if it should explicitly be stated that "dictionary input to decompression should be treated as being similarly untrusted as input compressed frames, and the appropriate bounds checking performed on data accesses as needed". In Appendix B, I don't see a reference to match up to a chapter "from normalized distribution to decoding tables". Some more minor section-by-section comments follow. Section 2.1.1 Should the frame-structure diagram list "0 or 4 bytes" for the content checksum?
Content_Checksum: An optional 32-bit checksum, only present if the Content_Checksum_flag is set. The content checksum is the result of the xxh64() hash function [XXHASH] digesting the origina "original" Section 2.1.1.1.1.2 In this case, Window_Descriptor byte is skipped, but "the Window_Descriptor byte" What's the difference between "Unused" and "Reserved", viz. 2.1.1.1.1.3 and 2.1.1.1.1.4? Section 2.1.1.1.1.6. This is a two-bit flag (= FHD & 3) two-bit flag (whose value is obtained by masking the frame header descriptor with 0x3) Section 2.1.1.1.2. Provides guarantees on minimum memory buffer required to decompress a frame. [...] "the minimum memory buffer" Section 2.1.1.1.3. Field size depends on Dictionary_ID_flag. [...] "The field size depends on the" How are dictionary (ID)s allocated and standardized? Section 2.1.1.2.2. RLE_Block: This is a single byte, repeated Block_Size times. Block_Content consists of a single byte. On the decompression side, this byte must be repeated Block_Size times. I think we need normative language on "the decoder MUST verify [...]". Section 2.1.1.3 This "all previously decoded data" is only within a frame, right? It might be worth reiterating that here. Section 2.1.1.3.1.1. For Size_Format for Raw_Literals_Block and RLE_Literals_Block, I think we need to explicitly say "when parsing the flags bit-by-bit, if the low-order bit of the Size_Format field is zero, the field is only one bit, and processing proceeds to the next bitfield. Regenerated_Size uses [...]" Section 2.1.1.3.2.1. Please expand FSE on first use. Section 2.4.1.1 Finally, the decoder can tell how many bytes were used in this process, and how many symbols are present. The bitstream consumes a round number of bytes. Any remaining bit within the last byte is simply unused. Usually we say "set to zero on encode, ignore on decode" to make things deterministic. Section 2.4.2 When decompressing, the last byte containing the padding is the first byte to read. 
The decompressor needs to skip 0-7 initial 0-bits and the first 1-bit lt occurs. Afterwards, the useful part of the bitstream begins. I guess "lt" is supposed to be "that"? Section 2.4.2.1 The tree depth is 4, since its smallest element uses 4 bits. It's not entirely clear to me what "smallest" means here -- presumably "lowest in the tree", though that is rather tautological in this context. "Smallest weight" would make sense, except there is not a weight column in the table. Section 2.4.2.1.3 if Number_of_Bits != 0 Number_of_Bits = Max_Number_of_Bits + 1 - Weight Should the first Number_of_Bits be Weight?
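The padding rule quoted from Section 2.4.2 (skip 0-7 leading zero bits, then the first 1-bit) can be sketched as follows. The function name is hypothetical, and treating an all-zero last byte as malformed is an assumption made for this illustration rather than something the comment states.

```python
def locate_bitstream_start(last_byte):
    """Return (useful_bits, padding_bits) for the final byte of a backward-read stream.

    The highest set bit is the end marker; the zero bits above it are padding,
    and the bits below it are the first useful bits of the stream.
    """
    if not 0 <= last_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    if last_byte == 0:
        # Assumption: without a marker bit the stream cannot be parsed.
        raise ValueError("no end marker found in last byte")
    marker_index = last_byte.bit_length() - 1  # position of the highest 1-bit
    padding_bits = 7 - marker_index
    useful_bits = marker_index
    return useful_bits, padding_bits


print(locate_bitstream_start(0b00010110))  # (4, 3)
```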
2018-05-24	01	Benjamin Kaduk	[Ballot Position Update] New position, Discuss, has been recorded for Benjamin Kaduk
2018-05-24	01	Alvaro Retana	[Ballot comment] I share Adam's concern and support his DISCUSS. FWIW, I would also be happy with a clarification along the lines of what Adam suggested.
2018-05-24	01	Alvaro Retana	Ballot comment text updated for Alvaro Retana
2018-05-24	01	Martin Vigoureux	[Ballot Position Update] New position, No Objection, has been recorded for Martin Vigoureux
2018-05-23	01	Terry Manderson	[Ballot Position Update] New position, No Objection, has been recorded for Terry Manderson
2018-05-23	01	Ben Campbell	[Ballot Position Update] Position for Ben Campbell has been changed to No Objection from No Record
2018-05-23	01	Ben Campbell	[Ballot comment] I agree with Adam's discuss points, although it was not clear to me if public dictionaries are expected. Others have already pointed out the document says "standards track". Otherwise I have a few mostly editorial comments: §2.1 - s/indepedently/independently §2.1.1, last paragraph - s/origina/original §2.1.1.1.1.3 "The value of this bit should be set to zero"- I know this isn't using 2119 language--but is the plain-English meaning of "should" what you had in mind? Also, why is this stated differently from §2.1.1.1.1.4? (Also also, what's the IETF record for section nesting depth?) §2.1.1.2.2, reserved - The text says "this value cannot be used with the current specification". What if you get a block that used it? §2.1.1.2.3 : "always strictly less than" - Is this really true? There's no possible input (e.g. already compressed text) where the decompressed and compressed sizes are equal? §4: - "Usual precautions"- does that refer to the subsequent paragraphs, or something else? - Any chance of a citation for "fuzz-test"? - Last paragraph, last sentence: Forward looking predictions in RFCs have a habit of becoming dated, one way or another. §6.2 - Are none of these references properly normative?
2018-05-23	01	Ben Campbell	Ballot comment text updated for Ben Campbell
2018-05-23	01	Alvaro Retana	[Ballot comment] I share Adam's concern and support his DISCUSS. FWIW, I would also be happy with a clarification along the lines of what Adam suggested.
2018-05-23	01	Alvaro Retana	[Ballot Position Update] New position, No Objection, has been recorded for Alvaro Retana
2018-05-23	01	Alissa Cooper	[Ballot Position Update] New position, No Objection, has been recorded for Alissa Cooper
2018-05-22	01	Suresh Krishnan	[Ballot Position Update] New position, No Objection, has been recorded for Suresh Krishnan
2018-05-22	01	Adam Roach	[Ballot discuss] Thanks for taking the time to document this format for public consumption. I have a handful of blocking concerns (although I'm open to listening to reasons that I might be wrong on this front), and a number of additional comments. --------------------------------------------------------------------------- I have a lot of heartburn around the publication of an informational document of a protocol called "Zstandard." I know the protocol has been in development for a while, and has non-trivial deployment, so I understand that there would be reluctance to change its name at this point. If we leave the name as-is, I do not think that the normal informational boilerplate is sufficient. I would like to see additional text that explicitly addresses the situation, along the lines of: [Abstract] Zstandard, or "zstd" (pronounced "zee standard"), is a data compression mechanism. This document describes the mechanism, and registers a media type to be used when transporting zstd-compressed via Multipurpose Internet Mail Extensions (MIME). Despite the use of the word "standard" as part of its name, readers are advised that this document is not an Internet Standards Track specification, and is being published for informational purposes only. [Introduction] Zstandard, or "zstd" (pronounced "zee standard") is a data compression mechanism, akin to gzip [RFC1952]. Despite the use of the word "standard" as part of its name, readers are advised that this document is not an Internet Standards Track specification, and is being published for informational purposes only. --------------------------------------------------------------------------- §2.2.1: > For the first block, the starting offset history is populated with > the following values : 1, 4 and 8 (in order).
I fear this is ambiguously specified. I can interpret this as either temporal order: Repeated_Offset1 = 8 Repeated_Offset2 = 4 Repeated_Offset3 = 1 Or as sequential order: Repeated_Offset1 = 1 Repeated_Offset2 = 4 Repeated_Offset3 = 8 Please clarify, as this confusion can lead to incompatible implementations. --------------------------------------------------------------------------- The dictionary scheme in here seems problematic, in that the intention is clearly to have public, well-known dictionaries; and the dictionaries are intended to have globally-unique identifiers for that purpose. 31 bits isn't enough space to achieve uniqueness through randomness. While there are other approaches that involve things like dictionary IDs that are hashes of their contents (see, e.g., SigComp), I suspect the notion of expanding the size of this field isn't very appealing. If you keep the format the same (4 bytes), I don't see how the dictionary part of this scheme can be interoperable without a registry of some kind. Even if the intention is to publish further documents on the topic of dictionaries, I believe publication of this document needs to wait on establishment of such a registry. I have no opinion about whether this is resolved by creating the registry in this document, or in holding its publication until the document that does create such a registry is published.
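The claim that 31 bits is too little space for uniqueness through randomness can be quantified with the standard birthday approximation. This is a worked estimate under the assumption that dictionary IDs are drawn uniformly at random from a 31-bit space; the function names are hypothetical.

```python
import math

ID_SPACE = 2 ** 31  # 31 usable bits, as stated in the comment above


def collision_probability(count, space=ID_SPACE):
    """Birthday approximation: P(collision) ~ 1 - exp(-n(n-1) / (2N))."""
    return 1.0 - math.exp(-count * (count - 1) / (2.0 * space))


def count_for_probability(target, space=ID_SPACE):
    """Approximate number of random IDs at which P(collision) reaches target."""
    return math.ceil(math.sqrt(2.0 * space * math.log(1.0 / (1.0 - target))))


for dictionaries in (1_000, 10_000, 100_000):
    print(dictionaries, round(collision_probability(dictionaries), 4))

print(count_for_probability(0.5))
```

Under this model, a collision becomes more likely than not once roughly fifty-five thousand independently chosen IDs exist, which is consistent with the reviewer's call for a registry.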
2018-05-22	01	Adam Roach	[Ballot comment] General: The document uses the phrase "natural order" in several places without defining it. I can make a guess about what is intended, but I'm not completely confident. Adding a definition for this term would be very helpful. --------------------------------------------------------------------------- §2.1.1: > of the xxh64() hash function [XXHASH] digesting the origina Typo: "original" --------------------------------------------------------------------------- §2.1.1.1.1.1. > This is a two-bit flag (equivalent to Frame_Header_Descriptor left- > shifted six bits) Shouldn't this say "right-shifted"? --------------------------------------------------------------------------- §2.1.1.3: > To decode a compressed block, the following elements are necessary: > > o Previous decoded data, up to a distance of Window_Size, or all > previously decoded data when Single_Segment_flag is set. To be clear, this is "up to a distance of Window_Size or to the beginning of the Frame, whichever is smaller," right? I believe the intention is that you can't use data from the previous frame to encode this one, and the text should probably take care to avoid any implication to the contrary. --------------------------------------------------------------------------- §2.1.1.3.1: > Literals can be stored uncompressed or compressed using Huffman > prefix codes. When compressed, an optional tree description can be > present, followed by one or four streams. A brief description right here of the concept of a "stream" would be quite helpful in understanding the following several sections. --------------------------------------------------------------------------- §2.1.1.3.1.1: > Value ?0: Size_Format uses one bit. Regenerated_Size uses five bits Please either define the meaning of "?"
here, or explicitly call out "Values 00 and 10:" --------------------------------------------------------------------------- §2.1.1.3.2.1: > o if (byte0 < 255): Number_of_Sequences = ((byte0-128) << 8) + > byte1. Uses 2 bytes. Please change to "if (127 < byte0 < 255):" --------------------------------------------------------------------------- §2.1.1.3.2.1: > Predefined_Mode: A predefined FSE distribution table is used, > defined in Section 2.1.1.3.2.2. No distribution table will be > present. I see that "FSE" is expanded in section 2.4.1. Please expand it here, or provide a reference to 2.4.1 here. --------------------------------------------------------------------------- §2.1.1.3.2.1: The table of compression modes lists the modes in the order "Predefined, RLE, FSE_Compressed, and "Repeat," while the description of each mode reverses the final two. Consider changing these to be in the same order. --------------------------------------------------------------------------- §2.1.1.3.2.1: The description makes it clear that Repeat_Mode is valid following RLE_Mode. It's unclear whether it's allowed after Predefined_Mode (which would be well-defined, but somewhat silly to code -- then again, the format seems rather permissive, so I would guess it's allowed). For avoidance of doubt, please either explicitly allow or explicitly forbid this. --------------------------------------------------------------------------- §2.1.1.3.2.1: > The description of the codes for how > to determine these values was presented earlier. Perhaps a reference to where this was done is in order? --------------------------------------------------------------------------- §2.3: While it doesn't impact the compression, this scheme seems pretty iffy in terms of utility and future-proofness. 
I would expect to see some kind of minimal tagging system indicating what kind of metadata the frame contains, even if no such kinds are defined by this document (e.g., something simple like "The first byte of User-Data indicates the type of metadata contained by this frame", and then set up an empty IANA table for registering such bytes...) --------------------------------------------------------------------------- §2.4.1.1: > A bitstream is read forward, in little-endian fashion. It is not > necessary to know its exact size, since the size will be discovered > and reported by the decoding process. The bitstream starts by > reporting on which scale it operates. Note that Accuracy_Log = > low4bits + 5. I can't find where "low4bits" is defined. Is this meaning to say that the least significant 4 bits of the initial (that is, highest-in-memory) byte are used to encode the Accuracy_Log, with an offset of 5? --------------------------------------------------------------------------- §2.4.1.1: > Value decoded: Small values use one less bit. Nit: "...one fewer bit..." --------------------------------------------------------------------------- §2.4.1.1: > All remaining symbols are sorted in their natural order. Starting > from symbol 0 and table position 0, each symbol gets attributed as > many cells as its probability. Cell allocation is non-linear linear; If the use of the phrase "non-linear linear" is not an editorial error, please provide a definition of what is meant by this phrase. --------------------------------------------------------------------------- §2.4.2.1.1.: > The full representation occupies (Number_of_Symbols+1)/2 bytes, > meaning it uses a last full byte even if Number_of_Symbols is odd. Based on the phrase after the comma, I think you mean ceiling((Number_of_Symbols+1)/2). The formula you have implies the opposite. 
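The byte-count question for §2.4.2.1.1 turns on whether the division is an integer division. A small check, assuming each weight takes 4 bits (two weights per byte), shows that an integer-division reading of (Number_of_Symbols+1)/2 already equals the ceiling of Number_of_Symbols/2, while applying a ceiling to (Number_of_Symbols+1)/2 would overcount for even inputs. The function name is hypothetical.

```python
import math


def weight_bytes(number_of_symbols):
    """Bytes needed for 4-bit weights, reading the formula as integer division."""
    return (number_of_symbols + 1) // 2


for n in range(1, 7):
    floor_reading = weight_bytes(n)
    ceil_of_half = math.ceil(n / 2)
    ceil_of_formula = math.ceil((n + 1) / 2)
    print(n, floor_reading, ceil_of_half, ceil_of_formula)
```

For n = 4, the integer-division reading gives 2 bytes, whereas the ceiling of (4+1)/2 gives 3, so stating the rule as the ceiling of Number_of_Symbols/2 avoids the ambiguity either way.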
--------------------------------------------------------------------------- §2.4.2.1.2: > The number of symbols to decode is determined by tracking the > bitStream overflow condition: If updating state after decoding a > symbol would require more bits than remain in the stream, it is > assumed that extra bits are zero. Then, the symbols for each of the > I final states are decoded and the process is complete. I presume the "I" on the beginning of this final line is a typo? --------------------------------------------------------------------------- §2.4.2.1.3: > Symbols are sorted by Weight. Within same Weight, symbols keep > natural order. Symbols with a Weight of zero are removed. Then, > starting from lowest weight, prefix codes are distributed in order. In what order? --------------------------------------------------------------------------- §3.1: > Published specification: [ZSTD] Given that the type being registered is neither vendor tree nor personal tree, I'm pretty sure this needs to be an RFC (or submitted by a recognized SDO). Luckily, we're standing in an RFC-to-be right now, so I think you can fix this by simply pointing to [RFCXXXX]. > For further information: See [ZSTD] I think this should be [RFCXXXX] as well. --------------------------------------------------------------------------- §4: > A decoder has to demonstrate capabilities to detect and prevent any > kind of data tampering in the compressed frame from triggering system > faults, such as reading or writing beyond allowed memory ranges. > This can be guaranteed either by the implementation language, or by > careful bound checkings. It is highly recommended to fuzz-test > decoder implementations to test and harden their capability to detect > bad frames and deal with them without any system side-effect. 
I think it makes sense to specifically call out encoding of Number_of_Sequences values that cause the decoder to read into the block header (and beyond), as well as the indication of a Frame Content Size that is smaller than the actual uncompressed data, in an attempt to trigger buffer overflow. --------------------------------------------------------------------------- §6.2: > [XXHASH] "XXHASH Algorithm", 2017, . This needs to be normative: one cannot implement the full range of features of the zstd format without understanding it.
2018-05-22	01	Adam Roach	[Ballot Position Update] New position, Discuss, has been recorded for Adam Roach
2018-05-21	01	Deborah Brungard	[Ballot comment] Confused also on the track. Datatracker says intended status is Informational, but the document says Standards Track.
2018-05-21	01	Deborah Brungard	[Ballot Position Update] New position, No Objection, has been recorded for Deborah Brungard
2018-05-21	01	Alexey Melnikov	1. Summary Alexey Melnikov is the responsible Area Director. Zstandard, or "zstd" (pronounced "zee standard"), is a data compression mechanism. This document describes the mechanism, and registers a media type to be used when transporting zstd-compressed via Multipurpose Internet Mail Extensions (MIME), as well as a new HTTP Content Coding. 2. Review and Consensus This is not a product of an IETF WG. There are multiple implementations of the "zstd" compression algorithm, see See also Section 5 of the draft. 3. Intellectual Property Editors confirmed that they have no IPR to disclose. 4. Other Points The document incorrectly states that it is Standards Track; however, it was IETF Last Called as Informational. The document is Informational, so there are no DownRefs. IANA Considerations are clear.
2018-05-18	01	Spencer Dawkins	[Ballot comment] I'll let you folks chat with Mirja about the larger topic of requirements language, but I note that this formulation 2.1.1.1.1.3. Unused Bit The value of this bit should be set to zero. A decoder compliant with this specification version shall not interpret it. It might be used in a future version, to signal a property which is not mandatory to properly decode the frame. really doesn't protect that bit for future use. I don't care if it's "MUST be set to zero by encoders and ignored by decoders" or "is always set to zero in this version of the algorithm", but a weaker constraint doesn't prevent implementers from squatting on this bit now. You know, something like the next subsection: 2.1.1.1.1.4. Reserved Bit This bit is reserved for some future feature. Its value must be zero. A decoder compliant with this specification version must ensure it is not set. This bit may be used in a future revision, to signal a feature that must be interpreted to decode the frame correctly. This text For improved interoperability, decoders are recommended to be compatible with Window_Size >= 8 MB, and encoders are recommended to not request more than 8 MB. It's merely a recommendation though, and decoders are free to support larger or lower limits, depending on local limitations. is pretty clear about the motivation for limiting the Window_Size to 8 MB, and why an implementation might want to use a smaller Window_Size, but is there anything you could say about why an implementation might want to use a larger Window_Size value? In this text, 2.4. Entropy Encoding Two types of entropy encoding are used by the Zstandard format: FSE, and Huffman coding. could you give any guidance about why you might choose to use one format over another?
Is the meaning of "under control of a third party" well understood? One should never compress together a message whose content must remain secret with a message under control of a third party. I might be able to guess at a precise definition, but I'd be guessing. I'm wondering if you really want to remove all of 5. Implementation Status [RFC EDITOR: Please remove this section prior to publication.] Source code for a C language implementation of a "Zstandard" compliant library is available at [ZSTD-GITHUB]. This implementation is production ready, implementing the full range of the specification. It is tested against security hazards, and widely deployed within Facebook infrastructure. given that this text 2.5. Dictionary Format (snip) However, dictionaries created by "zstd --train" in the reference implementation follow a specific format, described here. refers to what I'm assuming is the same reference implementation (but I can't be sure, because there's no reference pointer in the Section 2.5 usage). (I was on the IESG and balloted Yes for https://datatracker.ietf.org/doc/rfc7942/, so I understand that this says you delete Implementation Sections before publishing as an RFC, but I don't think pointers to a reference implementation fall into the same category as the typical "so far, X, Y, and Z have implemented this protocol" Implementation Sections that are instantly outdated. A pointer to a reference implementation sounds more useful for future readers. But, at a minimum, adding a reference pointer to the Section 2.5 occurrence would be useful, since that's the first time a reference implementation is mentioned)
2018-05-18	01	Spencer Dawkins	[Ballot Position Update] New position, No Objection, has been recorded for Spencer Dawkins
2018-05-18	01	Mirja Kühlewind	[Ballot comment] Unfortunately there is no shepherd write-up, therefore I have couple of questions/comments: 1) What's the reason that this document is submitted as "Standards Track"? In there a working group that is planning to use this mechanism? Or why does it need to be in the IETF/have IETF consensus? 2) I think this document would benefit from the use of normative language. 3) Why is there a normative reference to a website? I would think the document should be and is describing the compression mechanism comprehensively without the need to have a look at a website that might not even provide a stable reference. Sorry one more comment I forgot earlier: 4) Should the User_Data Frame be further discussed in the security section as it can carry arbitrary information which can be security-relevant or privacy sensitive...?
2018-05-18	01	Mirja Kühlewind	Ballot comment text updated for Mirja Kühlewind
2018-05-18	01	Mirja Kühlewind	[Ballot comment] Unfortunately there is no shepherd write-up, therefore I have couple of questions/comments: 1) What's the reason that this document is submitted as "Standards Track"? In there a working group that is planning to use this mechanism? Or why does it need to be in the IETF/have IETF consensus? 2) I think this document would benefit from the use of normative language. 3) Why is there a normative reference to a website? I would think the document should be and is describing the compression mechanism comprehensively without the need to have a look at a website that might not even provide a stable reference.
2018-05-18	01	Mirja Kühlewind	[Ballot Position Update] New position, No Objection, has been recorded for Mirja Kühlewind
2018-05-18	01	Alexey Melnikov	IESG state changed to IESG Evaluation from Waiting for Writeup
2018-05-18	01	Alexey Melnikov	Ballot has been issued
2018-05-18	01	Alexey Melnikov	[Ballot Position Update] New position, Yes, has been recorded for Alexey Melnikov
2018-05-18	01	Alexey Melnikov	Created "Approve" ballot
2018-05-18	01	Alexey Melnikov	Ballot writeup was changed
2018-04-26	01	Susan Hares	Request for Last Call review by OPSDIR Completed: Has Issues. Reviewer: Susan Hares. Sent review to list.
2018-04-23	01	Alexey Melnikov	Placed on agenda for telechat - 2018-05-24
2018-04-23	01	(System)	IESG state changed to Waiting for Writeup from In Last Call
2018-04-20	01	(System)	IANA Review state changed to IANA OK - Actions Needed from IANA - Review Needed
2018-04-20	01	Sabrina Tanamal	(Via drafts-lastcall@iana.org): IESG/Authors/WG Chairs: The IANA Services Operator has completed its review of draft-kucherawy-dispatch-zstd-01. If any part of this review is inaccurate, please let us know. The IANA Services Operator understands that, upon approval of this document, there are two actions which we must complete. First, in the application registry on the Media Types registry page located at: https://www.iana.org/assignments/media-types/ a single, new media type will be registered as follows: Name: zstd Template: [ TBD-at-Registration ] Reference: [ RFC-to-be ] Second, in the HTTP Content Coding Registry on the Hypertext Transfer Protocol (HTTP) Parameters registry page located at: https://www.iana.org/assignments/http-parameters/ a single, new registration will be added as follows: Name: zstd Description: A stream of bytes compressed using the Zstandard protocol Reference: [ RFC-to-be ] The IANA Services Operator understands that these are the only actions required to be completed upon approval of this document. Note: The actions requested in this document will not be completed until the document has been approved for publication as an RFC. This message is meant only to confirm the list of actions that will be performed. Thank you, Sabrina Tanamal Senior IANA Services Specialist
2018-04-19	01	Tero Kivinen	Request for Last Call review by SECDIR Completed: Ready. Reviewer: Scott Kelly.
2018-04-19	01	Vijay Gurbani	Request for Last Call review by GENART Completed: Ready with Nits. Reviewer: Vijay Gurbani. Sent review to list.
2018-04-05	01	Jean Mahoney	Request for Last Call review by GENART is assigned to Vijay Gurbani
2018-04-04	01	Gunter Van de Velde	Request for Last Call review by OPSDIR is assigned to Susan Hares
2018-03-29	01	Tero Kivinen	Request for Last Call review by SECDIR is assigned to Scott Kelly
2018-03-26	01	Amy Vezza	IANA Review state changed to IANA - Review Needed
2018-03-26	01	Amy Vezza	The following Last Call announcement was sent out (ends 2018-04-23): From: The IESG To: IETF-Announce CC: draft-kucherawy-dispatch-zstd@ietf.org, alexey.melnikov@isode.com Reply-To: ietf@ietf.org Sender: Subject: Last Call: (Zstandard Compression and The application/zstd Media Type) to Informational RFC The IESG has received a request from an individual submitter to consider the following document: - 'Zstandard Compression and The application/zstd Media Type' as Informational RFC The IESG plans to make a decision in the next few weeks, and solicits final comments on this action. Please send substantive comments to the ietf@ietf.org mailing lists by 2018-04-23. Exceptionally, comments may be sent to iesg@ietf.org instead. In either case, please retain the beginning of the Subject line to allow automated sorting. Abstract Zstandard, or "zstd" (pronounced "zee standard"), is a data compression mechanism. This document describes the mechanism, and registers a media type to be used when transporting zstd-compressed via Multipurpose Internet Mail Extensions (MIME). The file can be obtained via https://datatracker.ietf.org/doc/draft-kucherawy-dispatch-zstd/ IESG discussion can be tracked via https://datatracker.ietf.org/doc/draft-kucherawy-dispatch-zstd/ballot/ No IPR declarations have been submitted directly on this I-D.
2018-03-26	01	Amy Vezza	IESG state changed to In Last Call from Last Call Requested
2018-03-26	01	Amy Vezza	Last call announcement was changed
2018-03-23	01	Alexey Melnikov	Last call was requested
2018-03-23	01	Alexey Melnikov	Last call announcement was generated
2018-03-23	01	Alexey Melnikov	Ballot approval text was generated
2018-03-23	01	Alexey Melnikov	Ballot writeup was generated
2018-03-23	01	Alexey Melnikov	IESG state changed to Last Call Requested from AD Evaluation
2018-03-23	01	Alexey Melnikov	Changed consensus to Yes from Unknown
2018-03-23	01	Alexey Melnikov	IESG state changed to AD Evaluation from Publication Requested
2018-03-18	01	Alexey Melnikov	IETF WG state changed to Submitted to IESG for Publication
2018-03-18	01	Alexey Melnikov	IESG state changed to Publication Requested from AD is watching
2017-11-12	01	Murray Kucherawy	New version available: draft-kucherawy-dispatch-zstd-01.txt
2017-11-12	01	(System)	New version approved
2017-11-12	01	(System)	Request for posting confirmation emailed to previous authors: Yann Collet, Murray Kucherawy
2017-11-12	01	Murray Kucherawy	Uploaded new revision
2017-11-12	00	Alexey Melnikov	Assigned to Applications and Real-Time Area
2017-11-12	00	Alexey Melnikov	Responsible AD changed to Alexey Melnikov
2017-11-12	00	Alexey Melnikov	Intended Status changed to Informational
2017-11-12	00	Alexey Melnikov	IESG process started in state AD is watching
2017-11-12	00	Alexey Melnikov	Stream changed to IETF from None
2017-11-12	00	Murray Kucherawy	Added to session: IETF-100: dispatch Mon-0930
2017-09-25	00	Murray Kucherawy	New version available: draft-kucherawy-dispatch-zstd-00.txt
2017-09-25	00	(System)	New version approved
2017-09-25	00	Murray Kucherawy	Request for posting confirmation emailed to submitter and authors: Yann Collet, Murray Kucherawy, "Murray S. Kucherawy"
2017-09-25	00	Murray Kucherawy	Uploaded new revision
