Skip to main content

Packed CBOR: Table permutations
draft-amsuess-cbor-packed-shuffle-00

Document Type Active Internet-Draft (individual)
Author Christian Amsüss
Last updated 2024-07-29
RFC stream (None)
Intended RFC status (None)
Formats
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-amsuess-cbor-packed-shuffle-00
cbor                                                           C. Amsüss
Internet-Draft                                              29 July 2024
Intended status: Standards Track                                        
Expires: 30 January 2025

                    Packed CBOR: Table permutations
                  draft-amsuess-cbor-packed-shuffle-00

Abstract

   Packed CBOR is a compression mechanism for Concise Binary Object
   Representation (CBOR) that can be used without a decompression step.
   This document introduces a means for altering configured tables to
   optimize the use of efficient encoding points by shuffling around
   entries in the packing tables.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://chrysn.codeberg.page/packed-shuffle/draft-amsuess-cbor-
   packed-by-reference.html.  Status information for this document may
   be found at https://datatracker.ietf.org/doc/draft-amsuess-cbor-
   packed-shuffle/.

   Discussion of this document takes place on the CBOR Working Group
   mailing list (mailto:cbor@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/cbor/.  Subscribe at
   https://www.ietf.org/mailman/listinfo/cbor/.

   Source for this draft and an issue tracker can be found at
   https://codeberg.org/chrysn/packed-shuffle.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

Amsüss                   Expires 30 January 2025                [Page 1]
Internet-Draft       Packed CBOR: Table permutations           July 2024

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 30 January 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Setting up the tables by reference  . . . . . . . . . . . . .   3
     2.1.  Permutation semantics . . . . . . . . . . . . . . . . . .   3
     2.2.  Examples  . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Security Considerations . . . . . . . . . . . . . . . . . . .   5
   4.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   5
     4.1.  CBOR Tags Registry  . . . . . . . . . . . . . . . . . . .   5
   5.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   6
     5.1.  Normative References  . . . . . . . . . . . . . . . . . .   6
     5.2.  Informative References  . . . . . . . . . . . . . . . . .   6
   Appendix A.  Example implementation . . . . . . . . . . . . . . .   6
   Appendix B.  Open issues  . . . . . . . . . . . . . . . . . . . .   7
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   8

1.  Introduction

   [ See abstract. ]

   This document uses the "CPA" convention of
   [I-D.bormann-cbor-draft-numbers].  If it reaches the RFC editing
   stage, as IANA processes the IANA Considerations, this note is to be
   removed, and all occurrences of "CPA" will have been replaced with
   allocated numbers.

Amsüss                   Expires 30 January 2025                [Page 2]
Internet-Draft       Packed CBOR: Table permutations           July 2024

2.  Setting up the tables by reference

   CBOR tag CPA115 describes a permutation of the CBOR packing tables
   around a rump.  Unlike tags CPA113 and most other table setup tags,
   it does not add items; instead, it modifies the table such that items
   with very short encoded lengths (most efficient are the first 16
   items of the shared table, and the first item of the argument items
   in straight use, and the first 8 items of the inverted items) can be
   used even when they were not originally assigned to those positions.

   Without excluding others, there are two groups of use cases
   envisioned:

   *  Rearranging preconfigured tables.

      If a media type comes with a pre-configured table, and/or when one
      ore more external tables are referenced (e.g. by using
      [I-D.amsuess-cbor-packed-by-reference]), a single tag CPA115 can
      be used around the main rump of the document to pull the most
      frequently used entries into the prominent positions.

   *  Local optimization.

      If some table entries are frequent in a sub-tree of the document,
      using tag CPA115 around that subtree can be beneficiall to overall
      compactness even when the table was set up locally (using tag
      CPA113) with an ordering suitable for the whole document.

   Packed-Shuffle = #6.<cpa115>([
       shuffle-shared,
       ?shuffle-argument,
       rump])
   rump = any
   shuffle-shared = shuffle
   shuffle-argument = shuffle
   shuffle = [+(offset, ?length)]
   offset = uint
   length = nint

   cpa115 = 115   ; preliminary value, see IANA considerations

2.1.  Permutation semantics

   Inside the rump, each table has a permutation applied compared to the
   outer table, described in the suffle-shared value for the shared
   table, and in the shuffle-argument value for the argument table
   (which defaults to the identical permutation).

Amsüss                   Expires 30 January 2025                [Page 3]
Internet-Draft       Packed CBOR: Table permutations           July 2024

   The applied permutations are described in inverse (so that a table
   index used in the rump gets mapped to a table index outside this
   tag): The item shuffle consecutively lists the lowest items of the
   inner table by indicating their postions in the outer table.  A
   negative number indicates that a group of in total (1 - length) items
   are included consecutively starting with the item before the negative
   number.  This creates a kind of run-length encoding.  At the end of
   the shuffle array, all items that were not referenced are appended in
   their original sequence from the outer table.

   An inner index can be calculated in a way suitable to constrained
   systems by applying the algorighm in Appendix A.

2.2.  Examples

   Outer table:
    A B C D E F G H

   Permutation CBOR item:
    [1 / pick the "B" /, 4 / pick the "E" /, -2 / "3 items" /]

   Inner table:
    B E F G A C D H

                  Figure 1: Single permutation illustrated

Amsüss                   Expires 30 January 2025                [Page 4]
Internet-Draft       Packed CBOR: Table permutations           July 2024

   1113([
     ["tick.", "tock.", "tickety."],
     ["The clock goes ", "And it goes "],
       [
       [
         6(simple(0)), / "The clock goes tick." /
         6(simple(1)), / "The clock goes tock." /
         224(simple(2)), / "And it goes tickety." /
         6(simple(1)), / "The clock goes tock." /
       ]
       115([[], [1], [
         / The argument table is now:
           ["And it goes ", "The clock goes "]
         /
         6(simple(0)), / "And it goes tick." /
         6(simple(1)), / "And it goes tock." /
         6(simple(2)), / "And it goes tickety." /
         6(simple(1)), / "And it goes tock." /
       ]]),
       [
         6(simple(1)), / "The clock goes tock." /
       ]
     ]
   ])

       Figure 2: Using permutations to make the 6() straight argument
                                 available.

   Note that in the tick/tock example, using 6() over 224() saves just 1
   byte, whereas the setup around tag CPA115 costs 6 bytes.  This
   particular use would break even at 6 uses of tag 6.

3.  Security Considerations

   [ I don't think there's anything to add? ]

   The security considerations of [I-D.ietf-cbor-packed] apply.

4.  IANA Considerations

4.1.  CBOR Tags Registry

   In the registry "CBOR Tags", IANA is requested to allocate one tag:

   *  Tag: CPA115

   *  Data item: Array [shuffle-shared, ?shuffle-argument, rump]

Amsüss                   Expires 30 January 2025                [Page 5]
Internet-Draft       Packed CBOR: Table permutations           July 2024

   *  Semantics: "Packed CBOR: table permutation"

   *  Reference: This document

5.  References

5.1.  Normative References

   [I-D.bormann-cbor-draft-numbers]
              Bormann, C., "Managing CBOR numbers in Internet-Drafts",
              Work in Progress, Internet-Draft, draft-bormann-cbor-
              draft-numbers-03, 29 February 2024,
              <https://datatracker.ietf.org/doc/html/draft-bormann-cbor-
              draft-numbers-03>.

   [I-D.ietf-cbor-packed]
              Bormann, C. and M. Gütschow, "Packed CBOR", Work in
              Progress, Internet-Draft, draft-ietf-cbor-packed-12, 2
              March 2024, <https://datatracker.ietf.org/doc/html/draft-
              ietf-cbor-packed-12>.

5.2.  Informative References

   [I-D.amsuess-cbor-packed-by-reference]
              Amsüss, C., "Packed CBOR: Table set up by reference", Work
              in Progress, Internet-Draft, draft-amsuess-cbor-packed-by-
              reference-02, 4 March 2024,
              <https://datatracker.ietf.org/doc/html/draft-amsuess-cbor-
              packed-by-reference-02>.

Appendix A.  Example implementation

   The following algorithm illustrates how table lookup can be
   implemented without the need for variable size storage per
   compression table.

   An implementation in fixed memory is expected to accommodate up to a
   fixed size of nested table setup tags.  When parsing the shuffle
   item, if may calculate early_item_count and store it for later use,
   along with a reference to the position of the shuffle table, through
   which it iterates again for every item to be unpacked.

Amsüss                   Expires 30 January 2025                [Page 6]
Internet-Draft       Packed CBOR: Table permutations           July 2024

   def early_item_count(shuffle):
       count = 0
       for (offset, length) in shuffle:
           length = 1 if length is None else 1 - length
           count += length
       return count

   def lookup(index, shuffle, outertable):
       shuffle_early_items = early_item_count(shuffle)
       if index < shuffle_early_items:
           for (offset, length) in shuffle:
               length = 1 if length is None else 1 - length
               if index < length:
                   return outertable[offset + index]
               index -= length
       else:
           outer_index = index - shuffle_early_items
           for (offset, length) in shuffle:
               length = 1 if length is None else 1 - length
               if offset <= outer_index:
                   outer_index += length
           return outertable[outer_index]

Appendix B.  Open issues

   *  For this to progress it will need some real use cases, not just
      synthetic examples.

   *  An unoptimized version of this can be achieved with tag CPA113
      from [I-D.ietf-cbor-packed], without the numeric encoding of items
      and the special encoding for consecutive items.  Using CPA113 this
      way is relatively easy for shared items; for argument items, it
      requires referencing them from the shared table or performing a
      no-op concatenation with an empty item.

   *  Is skipping of used values worth it?  If we accepted that this tag
      produced not a permutation but an extended copy of the outer
      table, we could do without the double iteration / pre-computation
      of the number of items.  This would be a simplification, and more
      similar to using CPA113.  The advantage of the permutation
      construction is that the non-optimized entries are not pushed back
      more than necessary, saving a bit (in some cases, literally) in
      the encoded size of those points.

Amsüss                   Expires 30 January 2025                [Page 7]
Internet-Draft       Packed CBOR: Table permutations           July 2024

      Is there maybe a different expression of the permutation that has
      the same nice properties added complexity?  (Stating the number of
      items in front of the shuffle item saves the double processing,
      but adds redundant information and does not help with the
      branching in the implementation).

   Whether this document should progress on from the initial version
   should depend on whether good use cases are identified and whether it
   outperforms alternative solutions by a sufficient margin.

Author's Address

   Christian Amsüss
   Austria
   Email: christian@amsuess.com

Amsüss                   Expires 30 January 2025                [Page 8]