Updates to the CDDL grammar of RFC 8610
draft-ietf-cbor-update-8610-grammar-05

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Active".
Author Carsten Bormann
Last updated 2024-06-20 (Latest revision 2024-05-17)
Replaces draft-bormann-cbor-update-8610-grammar
RFC stream Internet Engineering Task Force (IETF)
Stream WG state Submitted to IESG for Publication
Document shepherd Christian Amsüss
Shepherd write-up Last changed 2024-05-03
IESG IESG state Approved-announcement to be sent::Revised I-D Needed
Consensus boilerplate Yes
Telechat date (None)
Responsible AD Orie Steele
Send notices to christian@amsuess.com
IANA IANA review state IANA OK - No Actions Needed
CBOR                                                          C. Bormann
Internet-Draft                                    Universität Bremen TZI
Updates: 8610 (if approved)                                  17 May 2024
Intended status: Standards Track                                        
Expires: 18 November 2024

                Updates to the CDDL grammar of RFC 8610
                 draft-ietf-cbor-update-8610-grammar-05

Abstract

   The Concise Data Definition Language (CDDL), as defined in RFC 8610
   and RFC 9165, provides an easy and unambiguous way to express
   structures for protocol messages and data formats that are
   represented in CBOR or JSON.

   The present document updates RFC 8610 by addressing errata and making
   other small fixes for the ABNF grammar defined for CDDL there.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at https://cbor-
   wg.github.io/update-8610-grammar/.  Status information for this
   document may be found at https://datatracker.ietf.org/doc/draft-ietf-
   cbor-update-8610-grammar/.

   Discussion of this document takes place on the CBOR Working Group
   mailing list (mailto:cbor@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/cbor/.  Subscribe at
   https://www.ietf.org/mailman/listinfo/cbor/.

   Source for this draft and an issue tracker can be found at
   https://github.com/cbor-wg/update-8610-grammar.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

Bormann                 Expires 18 November 2024                [Page 1]
Internet-Draft            CDDL grammar updates                  May 2024

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 18 November 2024.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1.  Conventions and Definitions . . . . . . . . . . . . . . .   3
   2.  Clarifications and Changes based on Errata Reports  . . . . .   3
     2.1.  Err6527 (text string literals)  . . . . . . . . . . . . .   3
     2.2.  Err6543 (byte string literals)  . . . . . . . . . . . . .   5
       Change proposed by Errata Report 6543 . . . . . . . . . . . .   5
       No change needed after addressing Err6527 (text string
           literals) (Section 2.1) . . . . . . . . . . . . . . . . .   6
   3.  Small Enabling Grammar Changes  . . . . . . . . . . . . . . .   8
     3.1.  Empty data models . . . . . . . . . . . . . . . . . . . .   8
     3.2.  Non-literal Tag Numbers, Simple Values  . . . . . . . . .   9
   4.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  10
   6.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  10
     6.1.  Normative References  . . . . . . . . . . . . . . . . . .  10
     6.2.  Informative References  . . . . . . . . . . . . . . . . .  11
   Appendix A.  Updated Collected ABNF for CDDL  . . . . . . . . . .  12
   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  14
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  14

1.  Introduction

   The Concise Data Definition Language (CDDL), as defined in [RFC8610]
   and [RFC9165], provides an easy and unambiguous way to express
   structures for protocol messages and data formats that are
   represented in CBOR or JSON.

   The present document updates [RFC8610] by addressing errata and
   making other small fixes for the ABNF grammar defined for CDDL there.

1.1.  Conventions and Definitions

   The Terminology from [RFC8610] applies.  The grammar in [RFC8610] is
   based on ABNF, which is defined in [STD68] and [RFC7405].

2.  Clarifications and Changes based on Errata Reports

   A number of errata reports have been made around some details of text
   string and byte string literal syntax: [Err6527] and [Err6543].
   These are being addressed in this section, updating details of the
   ABNF for these literal syntaxes.  Also, [Err6526] needs to be applied
   (backslashes have been lost during RFC processing in some text
   explaining backslash escaping).

   These changes are intended to mirror the way existing implementations
   have dealt with the errata.  They also use the opportunity presented
   by the necessary cleanup of the grammar of string literals for a
   backward compatible addition to the syntax for hexadecimal escapes.
   The latter change is not automatically forward compatible (i.e., CDDL
   specifications that make use of this syntax do not necessarily work
   with existing implementations until these are updated, which this
   specification recommends).

2.1.  Err6527 (text string literals)

   The ABNF used in [RFC8610] for the content of text string literals is
   rather permissive:

   ; RFC 8610 ABNF:
   text = %x22 *SCHAR %x22
   SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
   SESC = "\" (%x20-7E / %x80-10FFFD)

     Figure 1: Old ABNF for strings with permissive ABNF for SESC, but
                          not allowing hex escapes

   This allows almost any non-C0 character to be escaped by a backslash,
   but critically misses out on the \uXXXX and \uHHHH\uLLLL forms that
   JSON allows for specifying characters in hex (which should apply
   here according to Bullet 6 of Section 3.1 of [RFC8610]).  (Note that
   we import from JSON the unwieldy \uHHHH\uLLLL syntax, which
   represents Unicode code points beyond U+FFFF by making them look like
   UTF-16 surrogate pairs; CDDL text strings do not use UTF-16 or
   surrogates.)
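
   As an illustration of the \uHHHH\uLLLL form (not part of the
   grammar itself), the following Python sketch combines a high and a
   low surrogate into the Unicode scalar value that the pair stands
   for.  The function name combine_surrogates is hypothetical and was
   chosen for this example only.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Return the scalar value denoted by a high/low surrogate pair."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %04X" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %04X" % low)
    # Each surrogate carries 10 bits; the pair encodes (scalar - 0x10000).
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The escape pair \uD83C\uDC73 denotes U+1F073.
scalar = combine_surrogates(0xD83C, 0xDC73)
print("U+%04X" % scalar)  # prints "U+1F073"
```

   The result is a single code point; no surrogate ever becomes part
   of the CDDL text string itself.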

   Both problems can be solved by updating the SESC production.  We use
   the opportunity to add a popular form of directly specifying
   characters in strings using hexadecimal escape sequences of the form
   \u{hex}, where hex is the hexadecimal representation of the Unicode
   scalar value.  The result is the new set of rules defining SESC in
   Figure 2:

   ; new rules collectively defining SESC:
   SESC = "\" ( %x22 / "/" / "\" /                 ; \" \/ \\
                %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
                (%x75 hexchar) )                   ; \uXXXX
   hexchar = "{" (1*"0" [ hexscalar ] / hexscalar) "}" /
             non-surrogate / (high-surrogate "\" %x75 low-surrogate)
   non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
                   ("D" %x30-37 2HEXDIG )
   high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
   low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG
   hexscalar = "10" 4HEXDIG / HEXDIG1 4HEXDIG
             / non-surrogate / 1*3HEXDIG
   HEXDIG1 = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"

             Figure 2: Updated string ABNF to allow hex escapes

   (Notes: In ABNF, strings such as "A", "B" etc. are case-insensitive,
   as is intended here.  We could have written %x62 as %s"b", but
   didn't, in order to maximize ABNF tool compatibility.)
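
   The escape forms admitted by Figure 2 can be resolved with a small
   decoder.  The following Python sketch is one possible approach, not
   a reference implementation: the names unescape, _SESC, and _SIMPLE
   are hypothetical, and the sketch resolves escapes only; it does not
   check the unescaped characters against SCHAR or BCHAR.

```python
import re

# One named group per alternative of the SESC rule in Figure 2.  The empty
# "bad" group matches last, so that an unsupported escape can be reported.
_SESC = re.compile(
    r'\\(?:(?P<simple>["/\\bfnrt])'
    r'|u\{(?P<braced>[0-9A-Fa-f]+)\}'
    r'|u(?P<high>[Dd][89ABab][0-9A-Fa-f]{2})'
    r'\\u(?P<low>[Dd][C-Fc-f][0-9A-Fa-f]{2})'
    r'|u(?P<bmp>[0-9A-Fa-f]{4})'
    r'|(?P<bad>))'
)
_SIMPLE = {'"': '"', '/': '/', '\\': '\\', 'b': '\b', 'f': '\f',
           'n': '\n', 'r': '\r', 't': '\t'}


def unescape(content: str) -> str:
    """Resolve the escapes in the content of a CDDL string literal."""

    def replace(match):
        if match.group('simple') is not None:
            return _SIMPLE[match.group('simple')]
        if match.group('high') is not None:
            # Surrogate pair form: combine both halves into one scalar.
            high = int(match.group('high'), 16)
            low = int(match.group('low'), 16)
            return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))
        digits = match.group('braced') or match.group('bmp')
        if digits is None:
            raise ValueError('unsupported escape at offset %d' % match.start())
        value = int(digits, 16)
        # hexscalar and non-surrogate exclude surrogates and values
        # beyond U+10FFFF.
        if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
            raise ValueError('not a Unicode scalar value: %X' % value)
        return chr(value)

    return _SESC.sub(replace, content)


a = unescape(r"D\u{6f}mino's \u{1F073} + \u{2318}")
b = unescape(r"Domino's \uD83C\uDC73 + \u2318")
print(a == b)  # prints "True"
```

   A lone surrogate such as \uD83C, or a braced value such as
   \u{D800}, raises ValueError in this sketch, mirroring the fact that
   the updated grammar does not admit them.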

   Now that SESC is more restrictively formulated, this also requires an
   update to the BCHAR production used in the ABNF syntax for byte
   string literals:

   ; RFC 8610 ABNF:
   bytes = [bsqual] %x27 *BCHAR %x27
   BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
   bsqual = "h" / "b64"

                        Figure 3: Old ABNF for BCHAR

   With the SESC updated as above, \' is no longer allowed in BCHAR;
   this now needs to be explicitly included.

   Updating BCHAR also provides an opportunity to address [Err6278],
   which points to an inconsistency in treating U+007F (DEL) between
   SCHAR and BCHAR.  As U+007F is not printable, including it in a byte
   string literal is as confusing as for a text string literal, and it
   should therefore be excluded from BCHAR as it is from SCHAR.  The
   same reasoning also applies to the C1 control characters, so we
   actually exclude the entire range from U+007F to U+009F.  The same
   reasoning then also applies to text in comments (PCHAR).  For
   completeness, all these should also explicitly exclude the code
   points that have been set aside for UTF-16's surrogates.

   ; new rules for BCHAR and SCHAR:
   SCHAR = %x20-21 / %x23-5B / %x5D-7E / NONASCII / SESC
   BCHAR = %x20-26 / %x28-5B / %x5D-7E / NONASCII / SESC / "\'" / CRLF
   PCHAR = %x20-7E / NONASCII
   NONASCII = %xA0-D7FF / %xE000-10FFFD

             Figure 4: Updated ABNF for BCHAR, SCHAR, and PCHAR

   (Note that, apart from addressing the inconsistencies, there is no
   attempt to further exclude non-printable characters from the ABNF;
   doing this properly would draw in complexity from the ongoing
   evolution of the Unicode standard that is not needed here.)
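
   The character ranges of Figure 4 translate directly into code point
   predicates.  The following Python sketch is illustrative only; the
   function names are hypothetical, and the SCHAR and BCHAR predicates
   cover only the unescaped single-code-point alternatives (SESC, the
   \' escape, and the two-character CR LF sequence are left out).

```python
def is_nonascii(cp: int) -> bool:
    # NONASCII = %xA0-D7FF / %xE000-10FFFD
    return 0xA0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFD


def is_pchar(cp: int) -> bool:
    # PCHAR = %x20-7E / NONASCII
    return 0x20 <= cp <= 0x7E or is_nonascii(cp)


def is_plain_schar(cp: int) -> bool:
    # SCHAR without SESC: everything printable except '"' and '\'
    return (0x20 <= cp <= 0x21 or 0x23 <= cp <= 0x5B
            or 0x5D <= cp <= 0x7E or is_nonascii(cp))


def is_plain_bchar(cp: int) -> bool:
    # BCHAR without SESC and "\'": everything printable except "'" and
    # '\', plus a bare LF (the one-character form of CRLF)
    return (0x20 <= cp <= 0x26 or 0x28 <= cp <= 0x5B
            or 0x5D <= cp <= 0x7E or is_nonascii(cp) or cp == 0x0A)


# U+007F (DEL) and the C1 controls are now excluded everywhere.
print(any(f(0x7F) for f in (is_pchar, is_plain_schar, is_plain_bchar)))
# prints "False"
```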

2.2.  Err6543 (byte string literals)

   The ABNF used in [RFC8610] for the content of byte string literals
   lumps together byte strings notated as text with byte strings notated
   in base16 (hex) or base64 (but see also the updated BCHAR production
   above):

   ; RFC 8610 ABNF:
   bytes = [bsqual] %x27 *BCHAR %x27
   BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF

                        Figure 5: Old ABNF for BCHAR

Change proposed by Errata Report 6543

   Errata report 6543 proposes to handle the two cases in separate
   productions (where, with an updated SESC, BCHAR obviously needs to be
   updated as above):

   ; Err6543 proposal:
   bytes = %x27 *BCHAR %x27
         / bsqual %x27 *QCHAR %x27
   BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
   QCHAR = DIGIT / ALPHA / "+" / "/" / "-" / "_" / "=" / WS

    Figure 6: Errata Report 6543 Proposal to Split the Byte String Rules

   This potentially causes a subtle change, which is hidden in the WS
   production:

   ; RFC 8610 ABNF:
   WS = SP / NL
   SP = %x20
   NL = COMMENT / CRLF
   COMMENT = ";" *PCHAR CRLF
   PCHAR = %x20-7E / %x80-10FFFD
   CRLF = %x0A / %x0D.0A

               Figure 7: ABNF definition of WS from RFC 8610

   This allows any non-C0 character in a comment, so this fragment
   becomes possible:

   foo = h'
      43424F52 ; 'CBOR'
      0A       ; LF, but don't use CR!
   '

   The current text does not unambiguously say whether the three
   apostrophes need to be escaped with a \ or not, as in:

   foo = h'
      43424F52 ; \'CBOR\'
      0A       ; LF, but don\'t use CR!
   '

   ... which would be supported by the existing ABNF in [RFC8610].

No change needed after addressing Err6527 (text string literals)
(Section 2.1)

   This document takes the simpler approach of leaving the processing of
   the content of the byte string literal to a semantic step after
   processing the syntax of the bytes/BCHAR rules as updated by Figure 2
   and Figure 4.

   The rules in Figure 7 are therefore applied to the result of this
   processing where bsqual is given as h or b64.
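
   The two-step processing described above can be sketched in Python
   for the h qualifier.  This is a simplified illustration under stated
   assumptions: the function name decode_hex_literal is hypothetical,
   the first step resolves only the \' and \\ escapes (a full
   implementation would resolve every SESC form), and the second step
   removes comments and whitespace as defined by the WS rule in
   Figure 7 before decoding the hex digits.

```python
import re


def decode_hex_literal(content: str) -> bytes:
    """Decode the text between the quotes of an h'...' literal."""
    # Step 1 (syntax level): resolve escapes.  Only \' and \\ are
    # handled in this sketch.
    text = re.sub(r"\\(['\\])", r"\1", content)
    # Step 2 (semantic level): drop comments, i.e., ";" up to and
    # including the end of the line ...
    text = re.sub(r";[^\n]*\n", "", text)
    # ... then drop the remaining whitespace (SP, LF, CR LF).
    text = re.sub(r"[ \r\n]", "", text)
    return bytes.fromhex(text)


# Content of the escaped example above, as it appears between the quotes.
content = (
    "\n"
    "   43424F52 ; \\'CBOR\\'\n"
    "   0A       ; LF, but don\\'t use CR!\n"
)
print(decode_hex_literal(content))  # prints "b'CBOR\n'"
```

   Because the apostrophes are resolved in the first step, the comment
   text never interferes with finding the end of the literal.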

   Note that this approach also works well with the use of byte strings
   in Section 3 of [RFC9165].  It does require some care when copy-
   pasting into CDDL models from ABNF that contains single quotes (which
   may also hide as apostrophes in comments); these need to be escaped
   or possibly replaced by %x27.

   Finally, our approach lends support to extending bsqual in CDDL
   similar to the way this is done for CBOR diagnostic notation in
   [I-D.ietf-cbor-edn-literals].  (Note that the processing of string
   literals now is quite similar between CDDL and EDN, except that CDDL
   has ";"-based end-of-line comments, while EDN has two comment
   syntaxes, in-line "/"-based and end-of-line "#"-based.)

   The CDDL example in Figure 8 demonstrates various escaping
   techniques.  Obviously, in the literals for a and x, there is no need
   to escape the second character, an o, as \u{6f}; this is just for
   demonstration.  Similarly, as shown in c and z, there is also no need
   to escape the 🁳 or ⌘, but escaping them may be convenient in order to
   limit the character repertoire of a CDDL file itself to ASCII
   [STD80].

   start = [a, b, c, x, y, z]

   ; "🁳", DOMINO TILE VERTICAL-02-02, and
   ; "⌘", PLACE OF INTEREST SIGN, in a text string:
   a = "D\u{6f}mino's \u{1F073} + \u{2318}"      ; \u{}-escape 3 chars
   b = "Domino's \uD83C\uDC73 + \u2318"          ; escape JSON-like
   c = "Domino's 🁳 + ⌘"                          ; unescaped

   ; in a byte string given as text, the ' needs to be escaped:
   x = 'D\u{6f}mino\u{27}s \u{1F073} + \u{2318}' ; \u{}-escape 4 chars
   y = 'Domino\'s \uD83C\uDC73 + \u2318'         ; escape JSON-like
   z = 'Domino\'s 🁳 + ⌘'                         ; escape ' only

   Figure 8: Example text and byte string literals with various escaping
                                 techniques

   In this example, the rules a to c and x to z all produce strings with
   byte-wise identical content, where a to c are text strings, and x to
   z are byte strings.  Figure 9 illustrates this by showing the output
   generated from the start rule in Figure 8, using pretty-printed
   hexadecimal.

   86                                      # array(6)
      73                                   # text(19)
         446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
      73                                   # text(19)
         446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
      73                                   # text(19)
         446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
      53                                   # bytes(19)
         446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
      53                                   # bytes(19)
         446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
      53                                   # bytes(19)
         446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"

                 Figure 9: Generated CBOR from CDDL example
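
   The bytes shown in Figure 9 can be reproduced with a few lines of
   Python.  The sketch below is only a consistency check, not a CBOR
   encoder: the helper name head is hypothetical, and it handles only
   arguments below 24, which is sufficient for this example.

```python
def head(major: int, argument: int) -> bytes:
    """Encode a CBOR initial byte for an argument below 24."""
    if not 0 <= argument < 24:
        raise ValueError("this sketch only handles short arguments")
    return bytes([(major << 5) | argument])


payload = "Domino's \U0001F073 + \u2318".encode("utf-8")
text_item = head(3, len(payload)) + payload    # major type 3: text string
bytes_item = head(2, len(payload)) + payload   # major type 2: byte string
array = head(4, 6) + 3 * text_item + 3 * bytes_item  # major type 4: array

print(len(payload), payload.hex())
# prints "19 446f6d696e6f277320f09f81b3202b20e28c98"
print(array[:2].hex())  # prints "8673"
```

   The text and byte string items differ only in their initial byte
   (0x73 versus 0x53); the 19 content bytes are identical.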

3.  Small Enabling Grammar Changes

   The two subsections in this section specify two small changes to the
   grammar that are intended to enable certain kinds of specifications.
   These changes are backward compatible, i.e., CDDL files that comply
   with [RFC8610] continue to match the updated grammar, but not
   necessarily forward compatible, i.e., CDDL specifications that make
   use of these changes cannot necessarily be processed by existing
   [RFC8610] implementations.

3.1.  Empty data models

   [RFC8610] requires a CDDL file to have at least one rule.

   ; RFC 8610 ABNF:
   cddl = S 1*(rule S)

                Figure 10: Old ABNF for top-level rule cddl

   This makes sense when the file has to stand alone, as a CDDL data
   model needs to have at least one rule to provide an entry point
   (start rule).

   With CDDL modules [I-D.ietf-cbor-cddl-modules], CDDL files can also
   include directives, and these might be the source of all the rules
   that ultimately make up the module created by the file.  Any other
   rule content in the file has to be available for directive
   processing, making the requirement for at least one rule cumbersome.

   Therefore, we extend the grammar as in Figure 11 and make the
   existence of at least one rule a semantic constraint, to be fulfilled
   after processing of all directives.

   ; new top-level rule:
   cddl = S *(rule S)

              Figure 11: Updated ABNF for top-level rule cddl

3.2.  Non-literal Tag Numbers, Simple Values

   The existing ABNF syntax for expressing tags in CDDL is:

   ; extracted from RFC 8610 ABNF:
   type2 =/ "#" "6" ["." uint] "(" S type S ")"

                     Figure 12: Old ABNF for tag syntax

   This means tag numbers can only be given as literal numbers (uints).
   Some specifications operate on ranges of tag numbers, e.g., [RFC9277]
   has a range of tag numbers 1668546817 (0x63740101) to 1668612095
   (0x6374FFFF) to tag specific content formats.  This cannot currently
   be expressed in CDDL.  Similar considerations apply to simple values
   (#7.xx).

   This update extends the syntax to:

   ; new rules collectively defining the tagged case:
   type2 =/ "#" "6" ["." head-number] "(" S type S ")"
          / "#" "7" ["." head-number]
   head-number = uint / ("<" type ">")

         Figure 13: Updated ABNF for tag and simple value syntaxes

   For #6, the head-number stands for the tag number.  For #7, the head-
   number stands for the simple value if it is in the ranges 0..23 or
   32..255 (as per Section 3.3 of RFC 8949 [STD94] the simple values
   24..31 are not used).  For 24..31, the head-number stands for the
   "additional information", e.g., #7.25 or #7.<25> is a float16, etc.
   (All ranges mentioned here are inclusive.)

   So the above range can be expressed in a CDDL fragment such as:

   ct-tag<content> = #6.<ct-tag-number>(content)
   ct-tag-number = 1668546817..1668612095
   ; or use 0x63740101..0x6374FFFF

   Notes:

   1.  This syntax reuses the angle bracket syntax for generics; this
       reuse is innocuous as a generic parameter/argument only ever
       occurs after a rule name (id), while it occurs after . here.
       (Whether there is potential for human confusion can be debated;
       the above example deliberately uses generics as well.)

   2.  The updated ABNF grammar makes it a bit more explicit that the
       number given after the optional dot is special, not giving the
       CBOR "additional information" for tags and simple values as is
       the case with other uses of # in CDDL.  (Adding this observation
       to Section 2.2.3 of [RFC8610] is the subject of [Err6575]; it is
       correctly noted in Section 3.6 of [RFC8610].)  In hindsight,
       maybe a different character than the dot should have been chosen
       for this special case; however, changing the grammar now would
       be too disruptive.
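
   The interpretation of head-number described in this section can be
   summarized in a short Python sketch.  It is illustrative only: the
   constant and function names are hypothetical, and the sketch merely
   restates the ranges given above (the tag number range quoted from
   [RFC9277] and the two meanings of the number after #7).

```python
CT_TAG_FIRST = 0x63740101  # 1668546817
CT_TAG_LAST = 0x6374FFFF   # 1668612095


def is_ct_tag_number(tag_number: int) -> bool:
    """True if tag_number matches ct-tag-number from the fragment above."""
    return CT_TAG_FIRST <= tag_number <= CT_TAG_LAST


def describe_simple_head(head_number: int) -> str:
    """Say what the number after "#7." stands for (ranges inclusive)."""
    if 0 <= head_number <= 23 or 32 <= head_number <= 255:
        return "simple value %d" % head_number
    if 24 <= head_number <= 31:
        return "additional information %d" % head_number
    raise ValueError("out of range for #7: %d" % head_number)


print(CT_TAG_FIRST, CT_TAG_LAST)  # prints "1668546817 1668612095"
print(describe_simple_head(25))   # prints "additional information 25"
print(describe_simple_head(20))   # prints "simple value 20"
```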

4.  Security Considerations

   The grammar fixes and updates in this document are not believed to
   create additional security considerations.  The security
   considerations in Section 5 of [RFC8610] do apply, and specifically
   the potential for confusion is increased in an environment that uses
   a combination of CDDL tools some of which have been updated and some
   of which have not been, in particular based on Section 2.

5.  IANA Considerations

   This document has no IANA actions.

6.  References

6.1.  Normative References

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019, <https://www.rfc-editor.org/rfc/rfc8610>.

   [STD68]    Internet Standard 68,
              <https://www.rfc-editor.org/info/std68>.
              At the time of writing, this STD comprises the following:

              Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008,
              <https://www.rfc-editor.org/info/rfc5234>.

   [STD94]    Internet Standard 94,
              <https://www.rfc-editor.org/info/std94>.
              At the time of writing, this STD comprises the following:

              Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", STD 94, RFC 8949,
              DOI 10.17487/RFC8949, December 2020,
              <https://www.rfc-editor.org/info/rfc8949>.

6.2.  Informative References

   [Err6278]  "Errata Report 6278", RFC 8610,
              <https://www.rfc-editor.org/errata/eid6278>.

   [Err6526]  "Errata Report 6526", RFC 8610,
              <https://www.rfc-editor.org/errata/eid6526>.

   [Err6527]  "Errata Report 6527", RFC 8610,
              <https://www.rfc-editor.org/errata/eid6527>.

   [Err6543]  "Errata Report 6543", RFC 8610,
              <https://www.rfc-editor.org/errata/eid6543>.

   [Err6575]  "Errata Report 6575", RFC 8610,
              <https://www.rfc-editor.org/errata/eid6575>.

   [I-D.ietf-cbor-cddl-modules]
              Bormann, C. and B. Moran, "CDDL Module Structure", Work in
              Progress, Internet-Draft, draft-ietf-cbor-cddl-modules-02,
              4 March 2024, <https://datatracker.ietf.org/doc/html/
              draft-ietf-cbor-cddl-modules-02>.

   [I-D.ietf-cbor-edn-literals]
              Bormann, C., "CBOR Extended Diagnostic Notation (EDN):
              Application-Oriented Literals, ABNF, and Media Type", Work
              in Progress, Internet-Draft, draft-ietf-cbor-edn-literals-
              08, 1 February 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-cbor-
              edn-literals-08>.

   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
              RFC 7405, DOI 10.17487/RFC7405, December 2014,
              <https://www.rfc-editor.org/rfc/rfc7405>.

   [RFC9165]  Bormann, C., "Additional Control Operators for the Concise
              Data Definition Language (CDDL)", RFC 9165,
              DOI 10.17487/RFC9165, December 2021,
              <https://www.rfc-editor.org/rfc/rfc9165>.

   [RFC9277]  Richardson, M. and C. Bormann, "On Stable Storage for
              Items in Concise Binary Object Representation (CBOR)",
              RFC 9277, DOI 10.17487/RFC9277, August 2022,
              <https://www.rfc-editor.org/rfc/rfc9277>.

   [STD80]    Internet Standard 80,
              <https://www.rfc-editor.org/info/std80>.
              At the time of writing, this STD comprises the following:

              Cerf, V., "ASCII format for network interchange", STD 80,
              RFC 20, DOI 10.17487/RFC0020, October 1969,
              <https://www.rfc-editor.org/info/rfc20>.

Appendix A.  Updated Collected ABNF for CDDL

   This appendix is normative.

   It provides the full ABNF from [RFC8610] with the updates applied in
   the present document.

   cddl = S *(rule S)
   rule = typename [genericparm] S assignt S type
        / groupname [genericparm] S assigng S grpent

   typename = id
   groupname = id

   assignt = "=" / "/="
   assigng = "=" / "//="

   genericparm = "<" S id S *("," S id S ) ">"
   genericarg = "<" S type1 S *("," S type1 S ) ">"

   type = type1 *(S "/" S type1)

   type1 = type2 [S (rangeop / ctlop) S type2]
   ; space may be needed before the operator if type2 ends in a name

   type2 = value
         / typename [genericarg]
         / "(" S type S ")"
         / "{" S group S "}"
         / "[" S group S "]"
         / "~" S typename [genericarg]
         / "&" S "(" S group S ")"
         / "&" S groupname [genericarg]
         / "#" "6" ["." head-number] "(" S type S ")"
         / "#" "7" ["." head-number]
         / "#" DIGIT ["." uint]                ; major/ai
         / "#"                                 ; any
   head-number = uint / ("<" type ">")

   rangeop = "..." / ".."

   ctlop = "." id

   group = grpchoice *(S "//" S grpchoice)

   grpchoice = *(grpent optcom)

   grpent = [occur S] [memberkey S] type
          / [occur S] groupname [genericarg]  ; preempted by above
          / [occur S] "(" S group S ")"

   memberkey = type1 S ["^" S] "=>"
             / bareword S ":"
             / value S ":"

   bareword = id

   optcom = S ["," S]

   occur = [uint] "*" [uint]
         / "+"
         / "?"

   uint = DIGIT1 *DIGIT
        / "0x" 1*HEXDIG
        / "0b" 1*BINDIG
        / "0"

   value = number
         / text
         / bytes

   int = ["-"] uint

   ; This is a float if it has fraction or exponent; int otherwise
   number = hexfloat / (int ["." fraction] ["e" exponent ])
   hexfloat = ["-"] "0x" 1*HEXDIG ["." 1*HEXDIG] "p" exponent
   fraction = 1*DIGIT
   exponent = ["+"/"-"] 1*DIGIT

   text = %x22 *SCHAR %x22
   SCHAR = %x20-21 / %x23-5B / %x5D-7E / NONASCII / SESC

   SESC = "\" ( %x22 / "/" / "\" /                 ; \" \/ \\
                %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
                (%x75 hexchar) )                   ; \uXXXX

   hexchar = "{" (1*"0" [ hexscalar ] / hexscalar) "}" /
             non-surrogate / (high-surrogate "\" %x75 low-surrogate)
   non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
                   ("D" %x30-37 2HEXDIG )
   high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
   low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG
   hexscalar = "10" 4HEXDIG / HEXDIG1 4HEXDIG
             / non-surrogate / 1*3HEXDIG

   bytes = [bsqual] %x27 *BCHAR %x27
   BCHAR = %x20-26 / %x28-5B / %x5D-7E / NONASCII / SESC / "\'" / CRLF
   bsqual = "h" / "b64"

   id = EALPHA *(*("-" / ".") (EALPHA / DIGIT))
   ALPHA = %x41-5A / %x61-7A
   EALPHA = ALPHA / "@" / "_" / "$"
   DIGIT = %x30-39
   DIGIT1 = %x31-39
   HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
   HEXDIG1 = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"
   BINDIG = %x30-31

   S = *WS
   WS = SP / NL
   SP = %x20
   NL = COMMENT / CRLF
   COMMENT = ";" *PCHAR CRLF
   PCHAR = %x20-7E / NONASCII
   NONASCII = %xA0-D7FF / %xE000-10FFFD
   CRLF = %x0A / %x0D.0A

                    Figure 14: ABNF for CDDL as updated

Acknowledgments

   Many thanks go to the submitters of the errata reports addressed in
   this document.  In one of the ensuing discussions, Doug Ewell
   proposed to define an ABNF rule NONASCII, of which we have included
   the essence.  Special thanks to the reviewers Marco Tiloca, Christian
   Amsüss (shepherd review), and Orie Steele (AD review).

Author's Address

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany
   Phone: +49-421-218-63921
   Email: cabo@tzi.org
