CBOR Extended Diagnostic Notation (EDN): Application-Oriented Literals, ABNF, and Media Type
draft-ietf-cbor-edn-literals-08

Document Type Active Internet-Draft (cbor WG)
Author Carsten Bormann
Last updated 2024-03-20 (Latest revision 2024-02-01)
Replaces draft-bormann-cbor-edn-literals
RFC stream Internet Engineering Task Force (IETF)
Intended RFC status (None)
Stream WG state WG Consensus: Waiting for Write-Up
Document shepherd Christian Amsüss
IESG IESG state I-D Exists
Consensus boilerplate Unknown
Telechat date (None)
Responsible AD (None)
Send notices to christian@amsuess.com
Network Working Group                                         C. Bormann
Internet-Draft                                    Universität Bremen TZI
Intended status: Informational                           1 February 2024
Expires: 4 August 2024

CBOR Extended Diagnostic Notation (EDN): Application-Oriented Literals,
                          ABNF, and Media Type
                    draft-ietf-cbor-edn-literals-08

Abstract

   The Concise Binary Object Representation, CBOR (STD 94, RFC 8949),
   defines a "diagnostic notation" in order to be able to converse about
   CBOR data items without having to resort to binary data.

   This document specifies how to add application-oriented extensions to
   the diagnostic notation.  It then defines two such extensions for
   text representations of epoch-based date/times and of IP addresses
   and prefixes (RFC 9164).

   A few further additions close some gaps in usability.  To facilitate
   tool interoperation, this document specifies a formal ABNF definition
   for extended diagnostic notation (EDN) that accommodates application-
   oriented literals.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://cbor-wg.github.io/edn-literal/.  Status information for this
   document may be found at
   https://datatracker.ietf.org/doc/draft-ietf-cbor-edn-literals/.

   Discussion of this document takes place on the cbor Working Group
   mailing list (mailto:cbor@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/cbor/.  Subscribe at
   https://www.ietf.org/mailman/listinfo/cbor/.

   Source for this draft and an issue tracker can be found at
   https://github.com/cbor-wg/edn-literal.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

Bormann                   Expires 4 August 2024                 [Page 1]
Internet-Draft         CBOR EDN: Literals and ABNF         February 2024

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 4 August 2024.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  (Non-)Objectives of this Document
   2.  Application-Oriented Extension Literals
     2.1.  The "dt" Extension
     2.2.  The "ip" Extension
   3.  Stand-in Representations in Binary CBOR
     3.1.  Handling unknown application-extension identifiers
     3.2.  Handling information deliberately elided from an EDN
           document
   4.  IANA Considerations
     4.1.  CBOR Diagnostic Notation Application-extension Identifiers
           Registry
     4.2.  Encoding Indicators
     4.3.  Media Type
     4.4.  Content-Format
     4.5.  Stand-in Tags
   5.  Security considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Appendix A.  ABNF Definitions
     A.1.  Overall ABNF Definition for Extended Diagnostic
           Notation
     A.2.  ABNF Definitions for app-string Content
       A.2.1.  h: ABNF Definition of Hexadecimal representation of a
               byte string
       A.2.2.  b64: ABNF Definition of Base64 representation of a byte
               string
       A.2.3.  dt: ABNF Definition of RFC 3339 Representation of a
               Date/Time
       A.2.4.  ip: ABNF Definition of Textual Representation of an IP
               Address
   Appendix B.  EDN and CDDL
   Acknowledgements
   Author's Address

1.  Introduction

   For the Concise Binary Object Representation, CBOR, Section 8 of RFC
   8949 [STD94] in conjunction with Appendix G of [RFC8610] defines a
   "diagnostic notation" in order to be able to converse about CBOR data
   items without having to resort to binary data.  Diagnostic notation
   syntax is based on JSON, with extensions for representing CBOR
   constructs such as binary data and tags.  (Standardizing this
   together with the actual interchange format does not serve to create
   another interchange format, but enables the use of a shared
   diagnostic notation in tools for and in documents about CBOR.)

   This document specifies how to add application-oriented extensions to
   the diagnostic notation.  It then defines two such extensions for
   text representations of epoch-based date/times and of IP addresses
   and prefixes [RFC9164].

   A few further additions close some gaps in usability.  To facilitate
   tool interoperation, this document specifies a formal ABNF definition
   for extended diagnostic notation (EDN) that accommodates
   application-oriented literals.  (See Appendix A.1 for an overall ABNF
   grammar and Appendix A.2 for ABNF definitions of the content of both
   the byte string presentations predefined in [STD94] and the
   application-extensions.)

   In addition, this document finally registers a media type identifier
   and a content-format for CBOR diagnostic notation.  This does not
   elevate its status as an interchange format, but recognizes that
   interaction between tools is often smoother if media types can be
   used.

1.1.  Terminology

   Section 8 of RFC 8949 [STD94] defines the original CBOR diagnostic
   notation, and Appendix G of [RFC8610] supplies a number of extensions
   to the diagnostic notation that result in the Extended Diagnostic
   Notation (EDN).  The diagnostic notation extensions include popular
   features such as embedded CBOR (encoded CBOR data items in byte
   strings) and comments.  A simple diagnostic notation extension that
   enables representing CBOR sequences was added in Section 4.2 of
   [RFC8742].  As diagnostic notation is not used in the kind of
   interchange situations where backward compatibility would pose a
   significant obstacle, there is little point in not using these
   extensions.

   Therefore, when we refer to "_diagnostic notation_", we mean to
   include the original notation from Section 8 of RFC 8949 [STD94] as
   well as the extensions from Appendix G of [RFC8610], Section 4.2 of
   [RFC8742], and the present document.  However, we stick to the
   abbreviation "_EDN_" as it has become quite popular and is more
   sharply distinguishable from other meanings than "DN" would be.

   In a similar vein, the term "ABNF" in this document refers to the
   language defined in [STD68] as extended in [RFC7405], where the
   "characters" of Section 2.3 of RFC 5234 [STD68] are Unicode scalar
   values.  The term "CDDL" refers to the data definition language
   defined in [RFC8610] and its registered extensions (such as those in
   [RFC9165]), as well as [I-D.ietf-cbor-update-8610-grammar].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2.  (Non-)Objectives of this Document

   Section 8 of RFC 8949 [STD94] states the objective of defining a
   human-readable diagnostic notation with CBOR.  In particular, it
   states:

   |  All actual interchange always happens in the binary format.

   One important application of EDN is the notation of CBOR data for
   humans: in specifications, on whiteboards, and for entering test
   data.  A number of features, such as comments in string literals, are
   mainly useful for people-to-people communication via EDN.  Programs
   also often output EDN for diagnostic purposes, such as in error
   messages or to enable comparison (including generation of diffs via
   tools) with test data.

   For comparison with test data, it is often useful if different
   implementations generate the same (or similar) output for the same
   CBOR data items.  This is comparable to the objectives of
   deterministic serialization for CBOR data items themselves
   (Section 4.2 of RFC 8949 [STD94]).  However, there are even more
   representation variants in EDN than in binary CBOR, and there is
   little point in specifically endorsing a single variant as
   "deterministic" when other variants may be more useful for human
   understanding, e.g., the << >> notation as opposed to h''; an EDN
   generator may have quite a few options that control what presentation
   variant is most desirable for the application that it is being used
   for.

   Because of this, a deterministic representation is not defined for
   EDN, and there is no expectation for "roundtripping" from EDN to CBOR
   and back, i.e., for an ability to convert EDN to binary CBOR and back
   to EDN while achieving exactly the same result as the original input
   EDN — the original EDN possibly was created by humans or by a
   different EDN generator.

   However, there is a certain expectation that EDN generators can be
   configured to some basic output format, which:

   *  looks like JSON where that is possible;

   *  inserts encoding indicators only where the binary form differs
      from preferred encoding;

   *  uses hexadecimal representation (h'') for byte strings, not b64''
      or embedded CBOR (<<>>);

   *  does not generate elaborate blank space (newlines, indentation)
      for pretty-printing, but does use common blank spaces such as
      after , and :.
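
   As an illustration of the basic output format listed above, the
   following Python sketch serializes a small subset of data items
   (null, booleans, integers, text strings, byte strings, arrays, and
   maps).  The function name to_basic_edn is hypothetical and not part
   of this specification; floating-point numbers, tags, simple values,
   and encoding indicators are deliberately left out of the sketch.

```python
import json


def to_basic_edn(item):
    """Render a Python value as EDN text in a plain, JSON-like style.

    Hypothetical helper for illustration only; it covers a subset of
    the CBOR data model.
    """
    if item is None:
        return "null"
    if isinstance(item, bool):  # must be tested before int
        return "true" if item else "false"
    if isinstance(item, int):
        return str(item)
    if isinstance(item, str):
        # Text strings look like JSON strings.
        return json.dumps(item, ensure_ascii=False)
    if isinstance(item, bytes):
        # Byte strings use the hexadecimal form, not b64'' or << >>.
        return "h'" + item.hex() + "'"
    if isinstance(item, list):
        return "[" + ", ".join(to_basic_edn(x) for x in item) + "]"
    if isinstance(item, dict):
        members = (
            to_basic_edn(k) + ": " + to_basic_edn(v) for k, v in item.items()
        )
        return "{" + ", ".join(members) + "}"
    raise TypeError("unsupported item type: %s" % type(item).__name__)


print(to_basic_edn({"a": 1, "b": [2, b"\x01\x02"]}))
```

   The sketch emits a single blank space after each "," and ":" and
   no newlines or indentation, matching the last item of the list.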

   Additional features such as ensuring deterministic map ordering
   (Section 4.2 of RFC 8949 [STD94]) on output, or even deviating from
   the basic configuration in some systematic way, can further assist in
   comparing test data.  Information obtained from a CDDL model can help
   in choosing application-oriented literals or specific string
   representations such as embedded CBOR or b64'' in the appropriate
   places.

2.  Application-Oriented Extension Literals

   This document extends the syntax used in diagnostic notation for byte
   string literals to also be available for application-oriented
   extensions.

   As per Section 8 of RFC 8949 [STD94], the diagnostic notation can
   notate byte strings in a number of [RFC4648] base encodings, where
   the encoded text is enclosed in single quotes, prefixed by an
   identifier (»h« for base16, »b32« for base32, »h32« for base32hex,
   »b64« for base64 or base64url).

   This syntax can be thought of as establishing a namespace in which
   the names "h", "b32", "h32", and "b64" are taken and all other names
   are unallocated.  The present specification defines additional names
   for this namespace, which we call _application-extension
   identifiers_.
   For the quoted string, the same rules apply as for byte strings.  In
   particular, the escaping rules that were adapted from JSON strings
   are applied equivalently for application-oriented extensions, e.g.,
   within the quoted string \\ stands for a single backslash and \'
   stands for a single quote.

   An application-extension identifier is a name consisting of a lower-
   case ASCII letter (a-z) and zero or more additional ASCII characters
   that are either lower-case letters or digits (a-z0-9).

   Application-extension identifiers are registered in a registry
   (Section 4.1).

   Prefixing a single-quoted string, an application-extension identifier
   is used to build an application-oriented extension literal, which
   stands for a CBOR data item the value of which is derived from the
   text given in the single-quoted string using a procedure defined in
   the specification for an application-extension identifier.

   An application-extension (such as dt) MAY also define the meaning of
   a variant of the application-extension identifier where each lower-
   case character is replaced by its upper-case counterpart (such as
   DT), for building an application-oriented extension literal using
   that all-uppercase variant as the prefix of a single-quoted string.

   As a convention for such definitions, using the all-uppercase variant
   implies making use of a tag appropriate for this application-oriented
   extension (such as tag number 1 for DT).
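
   The identifier syntax and the all-uppercase convention can be
   sketched in Python as follows.  The function name classify_prefix
   and its return shape are assumptions made for this illustration; the
   two patterns are transcribed from the prose above.

```python
import re

# A lower-case letter followed by lower-case letters or digits.
_LOWER = re.compile(r"[a-z][a-z0-9]*")
# The variant in which each lower-case letter is replaced by upper case.
_UPPER = re.compile(r"[A-Z][A-Z0-9]*")


def classify_prefix(prefix):
    """Return (identifier, is_uppercase_variant) for a literal prefix.

    Hypothetical helper: "dt" yields ("dt", False), "DT" yields
    ("dt", True); mixed-case prefixes are rejected.
    """
    if _LOWER.fullmatch(prefix):
        return prefix, False
    if _UPPER.fullmatch(prefix):
        return prefix.lower(), True
    raise ValueError("not an application-extension prefix: %r" % prefix)


print(classify_prefix("dt"), classify_prefix("DT"))
```

   Whether the all-uppercase variant is actually defined, and which tag
   it implies, remains a matter for the specification of the individual
   application-extension.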

   Examples for application-oriented extensions to CBOR diagnostic
   notation can be found in the following sections.

2.1.  The "dt" Extension

   The application-extension identifier "dt" is used to notate a date/
   time literal that can be used as an Epoch-Based Date/Time as per
   Section 3.4.2 of RFC 8949 [STD94].

   The text of the literal is a Standard Date/Time String as per
   Section 3.4.1 of RFC 8949 [STD94].

   The value of the literal is a number representing the result of a
   conversion of the given Standard Date/Time String to an Epoch-Based
   Date/Time.  If fractional seconds are given in the text (production
   time-secfrac in Figure 4), the value is a floating-point number; the
   value is an integer number otherwise.  In the all-upper-case variant
   of the app-prefix, the value is enclosed in a tag with tag number 1.

   As an example, the CBOR diagnostic notation

   dt'1969-07-21T02:56:16Z',
   dt'1969-07-21T02:56:16.5Z',
   DT'1969-07-21T02:56:16Z'

   is equivalent to

   -14159024,
   -14159023.5,
   1(-14159024)

   See Appendix A.2.3 for an ABNF definition for the content of dt
   literals.
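
   A minimal Python sketch of the conversion described in this section
   is given below.  The names dt_value and dt_literal, the simplified
   regular expression, and the ("tag", number, content) tuple used to
   model a tagged item are assumptions of this illustration; leap
   seconds are not handled.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Simplified RFC 3339 date-time pattern (assumption: no leap seconds).
_DATE_TIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"([Zz]|[+-]\d{2}:\d{2})"
)


def dt_value(text):
    """Convert a Standard Date/Time String to an epoch-based number."""
    m = _DATE_TIME.fullmatch(text)
    if m is None:
        raise ValueError("not an RFC 3339 date-time: %r" % text)
    year, month, day, hour, minute, second = (
        int(g) for g in m.group(1, 2, 3, 4, 5, 6)
    )
    fraction, zone = m.group(7), m.group(8)
    offset = timedelta(0)
    if zone not in ("Z", "z"):
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
    moment = datetime(
        year, month, day, hour, minute, second, tzinfo=timezone(offset)
    )
    seconds = (moment - EPOCH) // timedelta(seconds=1)
    if fraction is None:
        return seconds  # integer when no fractional seconds are given
    return seconds + float("0" + fraction)  # floating point otherwise


def dt_literal(prefix, text):
    """Value of dt'...' (bare number) or DT'...' (wrapped in tag 1)."""
    value = dt_value(text)
    if prefix == "dt":
        return value
    if prefix == "DT":
        return ("tag", 1, value)
    raise ValueError("unknown prefix: %r" % prefix)


print(dt_literal("dt", "1969-07-21T02:56:16Z"))
print(dt_literal("dt", "1969-07-21T02:56:16.5Z"))
print(dt_literal("DT", "1969-07-21T02:56:16Z"))
```

   Applied to the three literals in the example above, the sketch
   yields -14159024, -14159023.5, and a tag 1 item containing
   -14159024.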

2.2.  The "ip" Extension

   The application-extension identifier "ip" is used to notate an IP
   address literal that can be used as an IP address as per Section 3 of
   [RFC9164].

   The text of the literal is an IPv4address or IPv6address as per
   Section 3.2.2 of [RFC3986].

   With the lower-case app-prefix ip, the value of the literal is a byte
   string representing the binary IP address.  With the upper-case
   app-prefix IP, the literal is such a byte string tagged with tag
   number 54, if an IPv6address is used, or tag number 52, if an
   IPv4address is used.

   As an additional case, the upper-case app-string IP'' can be used
   with a prefix such as 2001:db8::/56 or 192.0.2.0/24, with the
   equivalent tag as its value.  (Note that [RFC9164] representations of
   address prefixes need to implement the truncation of the address byte
   string as described in Section 4.2 of [RFC9164]; see example below.)
   For completeness, the lower-case variant ip'2001:db8::/56' or
   ip'192.0.2.0/24' stands for an unwrapped [56,h'20010db8'] or
   [24,h'c00002']; however, in this case the information on whether an
   address is IPv4 or IPv6 often needs to come from the context.

   Note that there is no direct representation of an address combined
   with a prefix length; this can be represented as
   52([ip'192.0.2.42',24]), if needed.

   Examples: the CBOR diagnostic notation

   ip'192.0.2.42',
   IP'192.0.2.42',
   IP'192.0.2.0/24',
   ip'2001:db8::42',
   IP'2001:db8::42',
   IP'2001:db8::/64'

   is equivalent to

   h'c000022a',
   52(h'c000022a'),
   52([24,h'c00002']),
   h'20010db8000000000000000000000042',
   54(h'20010db8000000000000000000000042'),
   54([64,h'20010db8'])

   See Appendix A.2.4 for an ABNF definition for the content of ip
   literals.
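
   The mapping from ip and IP literals to data items can be sketched
   with Python's standard ipaddress module.  The function name
   ip_literal and the ("tag", number, content) tuple are assumptions of
   this illustration.  Note also that the ipaddress module accepts a
   somewhat different set of textual forms than the ABNF in
   Appendix A.2.4, so this is not a validating parser.

```python
import ipaddress


def ip_literal(prefix, text):
    """Value of ip'...' or IP'...' for an address or an address prefix."""
    if "/" in text:
        # strict=True rejects prefixes with non-zero bits after the
        # prefix length.
        net = ipaddress.ip_network(text, strict=True)
        # Truncation: trailing zero bytes of the address are removed.
        truncated = net.network_address.packed.rstrip(b"\x00")
        value = [net.prefixlen, truncated]
        version = net.version
    else:
        addr = ipaddress.ip_address(text)
        value = addr.packed
        version = addr.version
    if prefix == "ip":
        return value
    if prefix == "IP":
        return ("tag", 52 if version == 4 else 54, value)
    raise ValueError("unknown prefix: %r" % prefix)


print(ip_literal("ip", "192.0.2.42").hex())
print(ip_literal("IP", "192.0.2.0/24"))
print(ip_literal("IP", "2001:db8::/64"))
```

   With the lower-case prefix, the caller has to know from context
   whether a truncated prefix such as [24, h'c00002'] is IPv4 or IPv6,
   as noted above.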

3.  Stand-in Representations in Binary CBOR

   In some cases, an EDN consumer cannot construct actual CBOR items
   that represent the CBOR data intended for eventual interchange.  This
   document defines stand-in representations for two such cases:

   *  The EDN consumer does not know (or does not implement) an
      application-extension identifier used in the EDN document
      (Section 3.1) but wants to preserve the information for a later
      processor.

   *  The generator of some EDN intended for human consumption (such as
      in a specification document) may not want to include parts of the
      final data item, destructively replacing complete subtrees or
      possibly just parts of a lengthy string by _elisions_
      (Section 3.2).

3.1.  Handling unknown application-extension identifiers

   When ingesting CBOR diagnostic notation, any application-oriented
   extension literals are usually decoded and transformed into the
   corresponding data item during ingestion.  If an application-
   extension is not known or not implemented by the ingesting process,
   this is usually an error and processing has to stop.

   However, in certain cases, it can be desirable to exceptionally carry
   an uninterpreted application-oriented extension literal in an
   ingested data item, so that its decoding can be postponed to a
   specific later stage of ingestion.

   This specification defines a CBOR Tag for this purpose: The
   Diagnostic Notation Unresolved Application-Extension Tag, tag number
   CPA999 (Section 4.5).  The content of this tag is an array of two
   text strings: The application-extension identifier, and the (escape-
   processed) content of the single-quoted string.  For example,
   dt'1969-07-21T02:56:16Z' can be provisionally represented as /CPA/
   999(["dt", "1969-07-21T02:56:16Z"]).
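
   The two behaviors described in this section (stopping with an error
   versus carrying the literal along as a stand-in) can be sketched in
   Python as follows.  The function name, the handlers mapping, the
   allow_unresolved switch, and the ("tag", number, content) tuple are
   all assumptions of this illustration; 999 is only the placeholder
   value written as CPA999 in this document.

```python
UNRESOLVED_TAG = 999  # placeholder for CPA999, pending IANA assignment


def ingest_app_literal(prefix, content, handlers, allow_unresolved=False):
    """Resolve an application-oriented extension literal during ingestion.

    handlers maps application-extension identifiers to functions that
    take the escape-processed content of the single-quoted string.
    """
    handler = handlers.get(prefix)
    if handler is not None:
        return handler(content)
    if allow_unresolved:
        # Postpone decoding: carry the identifier and content verbatim.
        return ("tag", UNRESOLVED_TAG, [prefix, content])
    raise ValueError("unknown application-extension identifier: %r" % prefix)


print(ingest_app_literal("dt", "1969-07-21T02:56:16Z", {}, allow_unresolved=True))
```

   A later processing stage that does implement the identifier can
   then replace the stand-in by the data item it stands for.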

   // RFC-Editor: This document uses the CPA (code point allocation)
   // convention described in [I-D.bormann-cbor-draft-numbers].  For
   // each usage of the term "CPA", please remove the prefix "CPA" from
   // the indicated value and replace the residue with the value
   // assigned by IANA; perform an analogous substitution for all other
   // occurrences of the prefix "CPA" in the document.  Finally, please
   // remove this note.

3.2.  Handling information deliberately elided from an EDN document

   When using EDN for exposition in a document or on a whiteboard, it is
   often useful to be able to leave out parts of an EDN document that
   are not of interest at that point of the exposition.

   To facilitate this, this specification supports the use of an
   _ellipsis_ (notated as three or more dots in a row, as in ...) to
   indicate parts of an EDN document that have been elided (and
   therefore cannot be reconstructed).

   Upon ingesting EDN as a representation of a CBOR data item for
   further processing, the occurrence of an ellipsis usually is an error
   and processing has to stop.

   However, it is useful to be able to process EDN documents with
   ellipses in the automation scripts for the documents using them.
   This specification defines a CBOR Tag that can be used in the
   ingestion for this purpose: The Diagnostic Notation Ellipsis Tag, tag
   number CPA888 (Section 4.5).  The content of this tag either is

   1.  null (indicating a data item entirely replaced by an ellipsis),
       or it is

   2.  an array whose elements alternate between fragments of a string
       and the actual elisions, each elision being represented as an
       Ellipsis tag carrying null as its content.

   Elisions can stand in for entire subtrees, e.g. in:

   [1, 2, ..., 3]
   ,
   { "a": 1,
     "b": ...,
     ...: ...
   }

   A single ellipsis (or key/value pair of ellipses) can imply eliding
   multiple elements in an array (members in a map); if more detailed
   control is required, a data definition language such as CDDL can be
   employed.  (Note that the stand-in form defined here does not allow
   multiple key/value pairs with an ellipsis as a key: the CBOR data
   item would not be valid.)

   Subtree elisions can be represented in a CBOR data item by using
   /CPA/888(null) as the stand-in:

   [1, 2, 888(null), 3]
   ,
   { "a": 1,
     "b": 888(null),
     888(null): 888(null)
   }
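
   As a sketch of this substitution, the following Python function
   walks a data item in which elisions are modeled by Python's built-in
   Ellipsis object and replaces each of them by the stand-in.  The
   function name, the use of Ellipsis, and the ("tag", number, content)
   tuple are assumptions of this illustration; 888 is only the
   placeholder value written as CPA888 in this document.

```python
ELLIPSIS_TAG = 888  # placeholder for CPA888, pending IANA assignment


def elision_stand_in(item):
    """Replace every elided subtree (Ellipsis) by a tag 888(null) stand-in."""
    if item is Ellipsis:
        return ("tag", ELLIPSIS_TAG, None)
    if isinstance(item, list):
        return [elision_stand_in(x) for x in item]
    if isinstance(item, dict):
        # Keys can be elided as well as values.
        return {
            elision_stand_in(k): elision_stand_in(v) for k, v in item.items()
        }
    return item


print(elision_stand_in([1, 2, ..., 3]))
print(elision_stand_in({"a": 1, "b": ..., ...: ...}))
```

   A Python dict cannot hold two entries with the key Ellipsis, which
   mirrors the restriction noted above on multiple key/value pairs with
   an ellipsis as a key.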

   Elisions also can be used as part of a (text or byte) string:

   { "contract": "Herewith I buy" ... "gned: Alice & Bob",
     "signature": h'4711...0815',
   }

   The example "contract" uses string concatenation as per Appendix G.4
   of [RFC8610], extending that by allowing ellipses; while the example
   "signature" uses special syntax that allows the use of ellipses
   between the bytes notated _inside_ h'' literals.

   String elisions can be represented in a CBOR data item by a stand-in
   that wraps an array of string fragments alternating with ellipsis
   indicators:

   { "contract": /CPA/888(["Herewith I buy", 888(null),
                           "gned: Alice & Bob"]),
     "signature": 888([h'4711', 888(null), h'0815']),
   }
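
   The construction of such a string stand-in can be sketched in Python
   as follows, with elisions again modeled by the built-in Ellipsis
   object.  The function name and the ("tag", number, content) tuple
   are assumptions of this illustration; 888 is only the placeholder
   value written as CPA888 in this document.

```python
ELLIPSIS_TAG = 888  # placeholder for CPA888, pending IANA assignment


def string_elision_stand_in(parts):
    """Wrap string fragments and elisions (Ellipsis) in a tag 888 array."""
    fragments = [p for p in parts if p is not Ellipsis]
    kinds = {type(p) for p in fragments}
    # All fragments must be text strings, or all must be byte strings.
    if not kinds <= {str} and not kinds <= {bytes}:
        raise TypeError("fragments must be all text or all byte strings")
    content = [
        ("tag", ELLIPSIS_TAG, None) if p is Ellipsis else p for p in parts
    ]
    return ("tag", ELLIPSIS_TAG, content)


print(string_elision_stand_in(["Herewith I buy", ..., "gned: Alice & Bob"]))
print(string_elision_stand_in([bytes.fromhex("4711"), ..., bytes.fromhex("0815")]))
```

   The two calls correspond to the "contract" and "signature" members
   of the example above.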

   Note that the use of elisions is different from "commenting out" EDN
   text, e.g.

   { "contract": "Herewith I buy" /.../ "gned: Alice & Bob",
     "signature": h'4711/.../0815',
     # ...: ...
   }

   The consumer of this EDN will ignore the comments and therefore will
   have no idea after ingestion that some information has been elided;
   validation steps may then simply fail instead of being informed about
   the elisions.

4.  IANA Considerations

   // RFC Editor: please replace RFC-XXXX with the RFC number of this
   // RFC, [IANA.cbor-diagnostic-notation] with a reference to the new
   // registry group, and remove this note.

4.1.  CBOR Diagnostic Notation Application-extension Identifiers
      Registry

   IANA is requested to create an "Application-Extension Identifiers"
   registry in a new "CBOR Diagnostic Notation" registry group
   [IANA.cbor-diagnostic-notation], with the policy "expert review"
   (Section 4.5 of RFC 8126 [BCP26]).

   The experts are instructed to be frugal in the allocation of
   application-extension identifiers that are suggestive of generally
   applicable semantics, keeping them in reserve for application-
   extensions that are likely to enjoy wide use and can make good use of
   their conciseness.  The expert is also instructed to direct the
   registrant to provide a specification (Section 4.6 of RFC 8126
   [BCP26]), but can make exceptions, for instance when a specification
   is not available at the time of registration but is likely
   forthcoming.  If the expert becomes aware of application-extension
   identifiers that are deployed and in use, they may also initiate a
   registration on their own if they deem such a registration can avert
   potential future collisions.

   Each entry in the registry must include:

   Application-Extension Identifier:
      a lower case ASCII [STD80] string that starts with a letter and
      can contain letters and digits after that ([a-z][a-z0-9]*).  No
      other entry in the registry can have the same application-
      extension identifier.

   Description:
      a brief description

   Change Controller:
      (see Section 2.3 of RFC 8126 [BCP26])

   Reference:
      a reference document that provides a description of the
      application-extension identifier

   The initial content of the registry is shown in Table 1; all initial
   entries have the Change Controller "IETF".

   +==================================+===================+===========+
   | Application-extension Identifier | Description       | Reference |
   +==================================+===================+===========+
   | h                                | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | b32                              | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | h32                              | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | b64                              | Reserved          | RFC8949   |
   +----------------------------------+-------------------+-----------+
   | dt                               | Date/Time         | RFC-XXXX  |
   +----------------------------------+-------------------+-----------+
   | ip                               | IP Address/Prefix | RFC-XXXX  |
   +----------------------------------+-------------------+-----------+

       Table 1: Initial Content of Application-extension Identifier
                                 Registry

4.2.  Encoding Indicators

   IANA is requested to create an "Encoding Indicators" registry in the
   newly created "CBOR Diagnostic Notation" registry group [IANA.cbor-
   diagnostic-notation], with the policy "specification required"
   (Section 4.6 of RFC 8126 [BCP26]).

   The experts are instructed to be frugal in the allocation of encoding
   indicators that are suggestive of generally applicable semantics,
   keeping them in reserve for encoding indicator registrations that are
   likely to enjoy wide use and can make good use of their conciseness.
   If the expert becomes aware of encoding indicators that are deployed
   and in use, they may also solicit a specification and initiate a
   registration on their own if they deem such a registration can avert
   potential future collisions.

   Each entry in the registry must include:

   Encoding Indicator:
      an ASCII [STD80] string that starts with an underscore and can
      contain zero or more underscores, letters, and digits after that
      (_[_A-Za-z0-9]*).  No other entry in the registry can have the
      same Encoding Indicator.

   Description:
      a brief description

   Change Controller:
      (see Section 2.3 of RFC 8126 [BCP26])

   Reference:
      a reference document that provides a description of the encoding
      indicator

   The initial content of the registry is shown in Table 2; all initial
   entries have the Change Controller "IETF".

          +====================+===================+===========+
          | Encoding Indicator | Description       | Reference |
          +====================+===================+===========+
          | _                  | Indefinite Length | RFC8949,  |
          |                    | Encoding (ai=31)  | RFC-XXXX  |
          +--------------------+-------------------+-----------+
          | _i                 | ai=0 to ai=23     | RFC-XXXX  |
          +--------------------+-------------------+-----------+
          | _0                 | ai=24             | RFC8949,  |
          |                    |                   | RFC-XXXX  |
          +--------------------+-------------------+-----------+
          | _1                 | ai=25             | RFC8949,  |
          |                    |                   | RFC-XXXX  |
          +--------------------+-------------------+-----------+
          | _2                 | ai=26             | RFC8949,  |
          |                    |                   | RFC-XXXX  |
          +--------------------+-------------------+-----------+
          | _3                 | ai=27             | RFC8949,  |
          |                    |                   | RFC-XXXX  |
          +--------------------+-------------------+-----------+

              Table 2: Initial Content of Encoding Indicator
                                 Registry
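
   To illustrate how the entries of Table 2 relate to the additional
   information (ai) in the initial byte of an encoded data item, the
   following Python sketch encodes an unsigned integer (major type 0)
   under a given encoding indicator.  The function name encode_uint is
   hypothetical; the indefinite-length indicator "_" does not apply to
   integers and is therefore not covered.

```python
import struct

# Encoding indicator -> (additional information, struct format of argument)
_ARGUMENT_FORMS = {
    "_0": (24, ">B"),  # 1-byte argument
    "_1": (25, ">H"),  # 2-byte argument
    "_2": (26, ">I"),  # 4-byte argument
    "_3": (27, ">Q"),  # 8-byte argument
}


def encode_uint(value, indicator):
    """Encode an unsigned integer using the requested encoding indicator."""
    if indicator == "_i":
        # Immediate encoding: the value itself is the additional information.
        if not 0 <= value <= 23:
            raise ValueError("_i requires a value in the range 0..23")
        return bytes([value])
    ai, fmt = _ARGUMENT_FORMS[indicator]
    # Major type 0 occupies the top three bits as 000, so the initial
    # byte equals ai; struct.pack raises struct.error if value does not fit.
    return bytes([ai]) + struct.pack(fmt, value)


print(encode_uint(1, "_i").hex())
print(encode_uint(1, "_1").hex())
```

   In EDN, the second call corresponds to the notation 1_1, which
   encodes the value 1 with a two-byte argument even though preferred
   encoding would use the immediate form.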

4.3.  Media Type

   IANA is requested to add the following Media-Type to the "Media
   Types" registry [IANA.media-types].

   +=================+=============================+=============+
   | Name            | Template                    | Reference   |
   +=================+=============================+=============+
   | cbor-diagnostic | application/cbor-diagnostic | RFC-XXXX,   |
   |                 |                             | Section 4.3 |
   +-----------------+-----------------------------+-------------+

         Table 3: New Media Type application/cbor-diagnostic

   Type name:  application
   Subtype name:  cbor-diagnostic
   Required parameters:  N/A

   Optional parameters:  N/A
   Encoding considerations:  binary (UTF-8)
   Security considerations:  Section 5 of RFC XXXX
   Interoperability considerations:  none
   Published specification:  Section 4.3 of RFC XXXX
   Applications that use this media type:  Tools interchanging a human-
      readable form of CBOR
   Fragment identifier considerations:  The syntax and semantics of
      fragment identifiers are as specified for "application/cbor".  (At
      publication of RFC XXXX, there is no fragment identification
      syntax defined for "application/cbor".)
   Additional information:
      Deprecated alias names for this type:  N/A

      Magic number(s):  N/A

      File extension(s):  .diag

      Macintosh file type code(s):  N/A
   Person & email address to contact for further information:  CBOR WG
      mailing list (cbor@ietf.org), or IETF Applications and Real-Time
      Area (art@ietf.org)
   Intended usage:  LIMITED USE
   Restrictions on usage:  CBOR diagnostic notation represents CBOR data
      items, which are the format intended for actual interchange.  The
      media type application/cbor-diagnostic is intended to be used
      within documents about CBOR data items, in diagnostics for human
      consumption, and in other representations of CBOR data items that
      are necessarily text-based such as in configuration files or other
      data edited by humans, often under source-code control.
   Author/Change controller:  IETF
   Provisional registration:  no

4.4.  Content-Format

   IANA is requested to register a Content-Format number in the "CoAP
   Content-Formats" sub-registry, within the "Constrained RESTful
   Environments (CoRE) Parameters" Registry [IANA.core-parameters], as
   follows:

   +=============================+================+======+===========+
   | Content-Type                | Content Coding | ID   | Reference |
   +=============================+================+======+===========+
   | application/cbor-diagnostic | -              | TBD1 | RFC-XXXX  |
   +-----------------------------+----------------+------+-----------+

                       Table 4: New Content-Format

   TBD1 is to be assigned from the space 256..999.

4.5.  Stand-in Tags

   // RFC-Editor: This document uses the CPA (code point allocation)
   // convention described in [I-D.bormann-cbor-draft-numbers].  For
   // each usage of the term "CPA", please remove the prefix "CPA" from
   // the indicated value and replace the residue with the value
   // assigned by IANA; perform an analogous substitution for all other
   // occurrences of the prefix "CPA" in the document.  Finally, please
   // remove this note.

   In the "CBOR Tags" registry [IANA.cbor-tags], IANA is requested to
   assign the tags in Table 5 from the "specification required" space
   (suggested assignments: 888 and 999), with the present document as
   the specification reference.

   +========+===========+==================================+===========+
   |    Tag | Data      | Semantics                        | Reference |
   |        | Item      |                                  |           |
   +========+===========+==================================+===========+
   | CPA888 | null or   | Diagnostic Notation Ellipsis     | RFC-XXXX  |
   |        | array     |                                  |           |
   +--------+-----------+----------------------------------+-----------+
   | CPA999 | array     | Diagnostic Notation              | RFC-XXXX  |
   |        |           | Unresolved Application-Extension |           |
   +--------+-----------+----------------------------------+-----------+

                          Table 5: Values for Tags

5.  Security considerations

   The security considerations of [STD94] and [RFC8610] apply.

6.  References

6.1.  Normative References

   [BCP26]    Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, June 2017.

              <https://www.rfc-editor.org/info/bcp26>

   [C]        International Organization for Standardization,
              "Information technology — Programming languages — C",
              Fourth Edition, ISO/IEC 9899:2018, June 2018,
              <https://www.iso.org/standard/74528.html>.  The text of
              the standard is also available via
              https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf

   [Cplusplus]
              International Organization for Standardization,
              "Programming languages — C++", Sixth Edition, ISO/
              IEC 14882:2020, December 2020,
              <https://www.iso.org/standard/79358.html>.  The text of
              the standard is also available via
              https://isocpp.org/files/papers/N4860.pdf

   [IANA.cbor-tags]
              IANA, "Concise Binary Object Representation (CBOR) Tags",
              <https://www.iana.org/assignments/cbor-tags>.

   [IANA.core-parameters]
              IANA, "Constrained RESTful Environments (CoRE)
              Parameters",
              <https://www.iana.org/assignments/core-parameters>.

   [IANA.media-types]
              IANA, "Media Types",
              <https://www.iana.org/assignments/media-types>.

   [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
              Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229,
              <https://ieeexplore.ieee.org/document/8766229>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
              Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
              <https://www.rfc-editor.org/rfc/rfc3339>.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005,
              <https://www.rfc-editor.org/rfc/rfc3986>.

   [RFC7405]  Kyzivat, P., "Case-Sensitive String Support in ABNF",
              RFC 7405, DOI 10.17487/RFC7405, December 2014,
              <https://www.rfc-editor.org/rfc/rfc7405>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019, <https://www.rfc-editor.org/rfc/rfc8610>.

   [RFC8742]  Bormann, C., "Concise Binary Object Representation (CBOR)
              Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020,
              <https://www.rfc-editor.org/rfc/rfc8742>.

   [RFC9164]  Richardson, M. and C. Bormann, "Concise Binary Object
              Representation (CBOR) Tags for IPv4 and IPv6 Addresses and
              Prefixes", RFC 9164, DOI 10.17487/RFC9164, December 2021,
              <https://www.rfc-editor.org/rfc/rfc9164>.

   [STD68]    Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

              <https://www.rfc-editor.org/info/std68>

   [STD80]    Cerf, V., "ASCII format for network interchange", STD 80,
              RFC 20, October 1969.

              <https://www.rfc-editor.org/info/std80>

   [STD94]    Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", STD 94, RFC 8949, December 2020.

              <https://www.rfc-editor.org/info/std94>

6.2.  Informative References

   [I-D.ietf-cbor-update-8610-grammar]
              Bormann, C., "Updates to the CDDL grammar of RFC 8610",
              Work in Progress, Internet-Draft, draft-ietf-cbor-update-
              8610-grammar-03, 29 January 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-cbor-
              update-8610-grammar-03>.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006,
              <https://www.rfc-editor.org/rfc/rfc4648>.

   [RFC9165]  Bormann, C., "Additional Control Operators for the Concise
              Data Definition Language (CDDL)", RFC 9165,
              DOI 10.17487/RFC9165, December 2021,
              <https://www.rfc-editor.org/rfc/rfc9165>.

   [STD90]    Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259, December 2017.

              <https://www.rfc-editor.org/info/std90>

Appendix A.  ABNF Definitions

   This appendix collects grammars in ABNF form ([STD68] as extended in
   [RFC7405]) that serve to define the syntax of EDN and some
   application-oriented literals.

   Implementation note: The ABNF definitions in this appendix are
   intended to be useful in a PEG parser interpretation (see Appendix A
   of [RFC8610] for an introduction into PEG).

A.1.  Overall ABNF Definition for Extended Diagnostic Notation

   This appendix provides an overall ABNF definition for the syntax of
   CBOR extended diagnostic notation.

   To complete the parsing of an app-string with prefix, say, p, the
   processed sqstr inside it is further parsed using the ABNF definition
   specified for the production app-string-p in Appendix A.2.

   For simplicity, the internal parsing for the built-in EDN prefixes is
   specified in the same way.  ABNF definitions for h'' and b64'' are
   provided in Appendix A.2.1 and Appendix A.2.2.  However, the prefixes
   b32'' and h32'' are not in wide use and an ABNF definition in this
   document could therefore not be based on implementation experience.

   seq             = S [item S *("," S item S) OC] S
   one-item        = S item S
   item            = map / array / tagged
                   / number / simple
                   / string / streamstring

   string1         = (tstr / bstr) spec
   string1e        = string1 / ellipsis
   ellipsis        = 3*"." ; "..." or more dots
   string          = string1e *(S string1e)

   number          = (basenumber / decnumber / infin) spec
   sign            = "+" / "-"

   decnumber       = [sign] (1*DIGIT ["." *DIGIT] / "." 1*DIGIT)
                            ["e" [sign] 1*DIGIT]
   basenumber      = [sign] "0" ("x" 1*HEXDIG
                                 [["." *HEXDIG] "p" [sign] 1*DIGIT]
                               / "x" "." 1*HEXDIG "p" [sign] 1*DIGIT
                               / "o" 1*ODIGIT
                               / "b" 1*BDIGIT)
   infin           = %s"Infinity"
                   / %s"-Infinity"
                   / %s"NaN"
   simple          = %s"false"
                   / %s"true"
                   / %s"null"
                   / %s"undefined"
                   / %s"simple(" S item S ")"
   uint            = "0" / DIGIT1 *DIGIT
   tagged          = uint spec "(" S item S ")"

   app-prefix      = lcalpha *lcalnum ; including h and b64
                   / ucalpha *ucalnum ; tagged variant, if defined
   app-string      = app-prefix sqstr
   sqstr           = "'" *single-quoted "'"
   bstr            = app-string / sqstr / embedded
                     ; app-string could be any type
   tstr            = DQUOTE *double-quoted DQUOTE
   embedded        = "<<" seq ">>"

   array           = "[" spec S [item S *("," S item S) OC] "]"
   map             = "{" spec S [kp S *("," S kp S) OC] "}"
   kp              = item S ":" S item

   ; We allow %x09 HT in prose, but not in strings
   blank           = %x09 / %x0A / %x0D / %x20
   non-slash       = blank / %x21-2e / %x30-D7FF / %xE000-10FFFF
   non-lf          = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF
   S               = *blank *(comment *blank)
   comment         = "/" *non-slash "/"
                   / "#" *non-lf %x0A

   ; optional trailing comma (ignored)
   OC              = ["," S]

   ; check semantically that strings are either all text or all bytes
   ; note that there must be at least one string to distinguish
   streamstring    = "(_" S string S *("," S string S) OC ")"
   spec            = ["_" *wordchar]

   double-quoted   = unescaped
                   / "'"
                   / "\" DQUOTE
                   / "\" escapable

   single-quoted   = unescaped
                   / DQUOTE
                   / "\" "'"
                   / "\" escapable

   escapable       = %s"b" ; BS backspace U+0008
                   / %s"f" ; FF form feed U+000C
                   / %s"n" ; LF line feed U+000A
                   / %s"r" ; CR carriage return U+000D
                   / %s"t" ; HT horizontal tab U+0009
                   / "/"   ; / slash (solidus) U+002F (JSON!)
                   / "\"   ; \ backslash (reverse solidus) U+005C
                   / (%s"u" hexchar) ;  uXXXX      U+XXXX

   hexchar         = "{" (1*"0" [ hexscalar ] / hexscalar) "}"
                   / non-surrogate
                   / (high-surrogate "\" %s"u" low-surrogate)
   non-surrogate   = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG)
                   / ("D" ODIGIT 2HEXDIG )
   high-surrogate  = "D" ("8"/"9"/"A"/"B") 2HEXDIG
   low-surrogate   = "D" ("C"/"D"/"E"/"F") 2HEXDIG
   hexscalar       = "10" 4HEXDIG / HEXDIG1 4HEXDIG
                   / non-surrogate / 1*3HEXDIG

   ; Note that no other C0 characters are allowed, including %x09 HT
   unescaped       = %x0A ; new line
                   / %x0D ; carriage return -- ignored on input
                   / %x20-21
                        ; omit 0x22 "
                   / %x23-26
                        ; omit 0x27 '
                   / %x28-5B
                        ; omit 0x5C \
                   / %x5D-D7FF ; skip surrogate code points
                   / %xE000-10FFFF

   DQUOTE          = %x22    ; " double quote
   DIGIT           = %x30-39 ; 0-9
   DIGIT1          = %x31-39 ; 1-9
   ODIGIT          = %x30-37 ; 0-7
   BDIGIT          = %x30-31 ; 0-1
   HEXDIG          = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
   HEXDIG1         = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"
   ; Note: double-quoted strings as in "A" are case-insensitive in ABNF

   lcalpha         = %x61-7A ; a-z
   lcalnum         = lcalpha / DIGIT
   ucalpha         = %x41-5A ; A-Z
   ucalnum         = ucalpha / DIGIT
   wordchar        = "_" / lcalnum / ucalpha ; [_a-z0-9A-Z]

                                  Figure 1
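
   As an illustration of how the decnumber and basenumber rules of
   Figure 1 might be turned into values, the following minimal sketch
   validates the input against regular expressions transcribed from the
   ABNF and then hands it to the numeric parsers of the platform.  The
   function names parse_decnumber and parse_basenumber are hypothetical.
   The sketch assumes that a number is an integer unless a ".", "e", or
   "p" part is present, and it does not handle the trailing encoding
   indicator (spec).

```python
import re

# Literal strings such as "e", "x", "p" are case-insensitive in ABNF.
_DEC = re.compile(
    r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:e[+-]?[0-9]+)?",
    re.IGNORECASE)
_BASE = re.compile(
    r"[+-]?0(?:x[0-9a-f]+(?:(?:\.[0-9a-f]*)?p[+-]?[0-9]+)?"
    r"|x\.[0-9a-f]+p[+-]?[0-9]+"
    r"|o[0-7]+"
    r"|b[01]+)",
    re.IGNORECASE)


def parse_decnumber(text):
    """Return an int, or a float if a "." or "e" part is present."""
    if not _DEC.fullmatch(text):
        raise ValueError("not a decnumber: %r" % text)
    if "." in text or "e" in text.lower():
        return float(text)  # Python's float() accepts "3." and ".3"
    return int(text)


def parse_basenumber(text):
    """Return an int, or a float if a "p" exponent is present."""
    if not _BASE.fullmatch(text):
        raise ValueError("not a basenumber: %r" % text)
    if "p" in text.lower():
        return float.fromhex(text)  # hexadecimal floating point
    return int(text, 0)  # base 0: honors the 0x, 0o, and 0b prefixes
```

   With these definitions, parse_decnumber("3.") returns the floating
   point value 3.0, parse_decnumber("42") returns the integer 42, and
   parse_basenumber("0x1.8p1") returns 3.0.  Other platforms may need
   to rewrite forms such as "3." before their parsers accept them.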

   While an ABNF grammar defines the set of character strings that are
   considered to be valid EDN by this ABNF, the mapping of these
   character strings into the generic data model of CBOR is not always
   obvious.

   The following additional items should help in the interpretation:

   *  decnumber stands for an integer in the usual decimal notation,
      unless at least one of the optional parts starting with "." and
      "e" is present, in which case it stands for a floating point
      value in the usual decimal notation.  Note that the grammar now
      allows 3. for 3.0 and .3 for 0.3 (also for hexadecimal floating
      point below); implementers are advised that some platform numeric
      parsers accept only a subset of the floating point syntax in this
      document, so input may need some preprocessing before it is
      passed to them.

   *  basenumber stands for an integer in the usual base 16/hexadecimal
      ("0x"), base 8/octal ("0o"), or base 2/binary ("0b") notation,
      unless the optional part containing a "p" is present, in which
      case it stands for a floating point number in the usual
      hexadecimal notation (which uses a mantissa in hexadecimal and an
      exponent in decimal notation, see Section 5.12.3 of [IEEE754],
      Section 6.4.4.2 of [C], or Section 5.13.4 of [Cplusplus];
      floating-suffix/floating-point-suffix from the latter two is not
      used here).

   *  spec stands for an encoding indicator.  As per Section 8.1 of RFC
      8949 [STD94]:

      -  an underscore _ on its own stands for indefinite length
         encoding (ai=31, only available behind the opening brace/
         bracket for map and array: strings have a special syntax
         streamstring for indefinite length encoding except for the
         special cases ''_ and ""_), and

      -  _0 to _3 stand for ai=24 to ai=27, respectively.

      Surprisingly, Section 8.1 of RFC 8949 [STD94] does not address
      ai=0 to ai=23.  The assumption seems to be that preferred
      serialization (Section 4.1 of RFC 8949 [STD94]) will be used when
      converting CBOR diagnostic notation to an encoded CBOR data item,
      so leaving out the encoding indicator for a data item with a
      preferred serialization will implicitly use ai=0 to ai=23 if that
      is possible.  The present specification allows making this
      explicit:

      -  _i ("immediate") stands for encoding with ai=0 to ai=23.

      While no pressing use for further values for encoding indicators
      comes to mind, this is an extension point for EDN; Section 4.2
      defines a registry for additional values.

   *  string and the rules preceding it in the same block realize both
      the representation of strings that are split up into multiple
      chunks (Section G.4 of RFC 8949 [STD94]) and the use of ellipses
      to represent elisions (Section 3.2).  The semantic processing of
      these rules is relatively complex:

      -  A single ... is a general ellipsis, which can stand for any
         data item.

      -  An ellipsis can be surrounded (on one or both sides) by string
         chunks.  The result is a CBOR tag with tag number CPA888 that
         contains an array of the joined-together spans of such chunks
         plus the ellipses, each represented by CPA888(null).

      -  A simple sequence of string chunks is simply joined together.
         In both cases of joining strings, the rules of Section G.4 of
         RFC 8949 [STD94] need to be followed; in particular, if a text
         string results from the joining operation, that result needs to
         be valid UTF-8.

      -  Some of the strings may be app-strings.  If the type of the
         app-string is an actual string, joining of chunked strings
         occurs as with directly notated strings; otherwise the
         occurrence of more than one app-string or an app-string
         together with a directly notated string cannot be processed.
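
   The semantic processing of string chunks and ellipses described
   above can be sketched as follows.  This is a simplified illustration
   under stated assumptions, not a definitive implementation: chunks
   are given as Python str (text string), bytes (byte string), or
   Python's own Ellipsis object for an EDN ellipsis; the type of the
   first string chunk determines the type of the result; app-strings
   are assumed to have been resolved to strings already; and the tag
   number 888 is only the suggested assignment for CPA888.  Tag,
   join_span, and resolve_string are hypothetical names.

```python
from collections import namedtuple

Tag = namedtuple("Tag", "number content")  # hypothetical stand-in for a tag
ELLIPSIS_TAG = 888  # suggested value for CPA888, not yet assigned by IANA


def join_span(span, as_text):
    """Join adjacent chunks; text results must be valid UTF-8."""
    raw = b"".join(
        c.encode("utf-8") if isinstance(c, str) else c for c in span)
    return raw.decode("utf-8") if as_text else raw  # raises if invalid


def resolve_string(chunks):
    """chunks: non-empty list of str, bytes, or ... (an EDN ellipsis)."""
    strings = [c for c in chunks if c is not ...]
    if not strings:
        if len(chunks) == 1:
            return Tag(ELLIPSIS_TAG, None)  # general ellipsis
        raise ValueError("several bare ellipses: not covered by this sketch")
    as_text = isinstance(strings[0], str)
    if not as_text and any(isinstance(c, str) for c in strings):
        raise ValueError("text chunk inside a byte string")
    if len(strings) == len(chunks):
        return join_span(chunks, as_text)  # plain chunked string
    items, span = [], []
    for chunk in chunks:
        if chunk is ...:
            if span:
                items.append(join_span(span, as_text))
                span = []
            items.append(Tag(ELLIPSIS_TAG, None))
        else:
            span.append(chunk)
    if span:
        items.append(join_span(span, as_text))
    return Tag(ELLIPSIS_TAG, items)
```

   For instance, resolve_string(["Hello ", "world"]) returns the single
   text string "Hello world", while resolve_string(["abc", ..., "xyz"])
   returns a tag whose content is the array of "abc", an ellipsis tag
   with null content, and "xyz".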

A.2.  ABNF Definitions for app-string Content

   This appendix provides ABNF definitions for application-oriented
   extension literals defined in [STD94] and in this specification.
   These grammars describe the _decoded_ content of the sqstr components
   that combine with the application-extension identifiers to form
   application-oriented extension literals.  Each of these may make use
   of rules defined in Figure 1.

A.2.1.  h: ABNF Definition of Hexadecimal representation of a byte
        string

   The syntax of the content of byte strings represented in hex, such as
   h'', h'0815', or h'/head/ 63 /contents/ 66 6f 6f' (another
   representation of << "foo" >>), is described by the ABNF in Figure 2.
   This syntax accommodates both lower case and upper case hex digits,
   as well as blank space (including comments) around each hex digit.

   app-string-h    = S *(HEXDIG S HEXDIG S / ellipsis S)
                     ["#" *non-lf]
   ellipsis        = 3*"."
   HEXDIG          = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
   DIGIT           = %x30-39 ; 0-9
   blank           = %x09 / %x0A / %x0D / %x20
   non-slash       = blank / %x21-2e / %x30-D7FF / %xE000-10FFFF
   non-lf          = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF
   S               = *blank *(comment *blank )
   comment         = "/" *non-slash "/"
                   / "#" *non-lf %x0A

     Figure 2: ABNF Definition of Hexadecimal Representation of a Byte
                                   String
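
   A decoder for the content of an h'' literal might be sketched as
   follows.  The function name decode_hex_content is hypothetical.  The
   sketch removes both comment forms and all blank space before
   decoding, and it rejects elisions (ellipsis) instead of resolving
   them.

```python
import re

# "/.../" comments, and "#" comments running to the end of the line or
# to the end of the content.
_COMMENT = re.compile(r"/[^/]*/|#[^\n]*(?:\n|\Z)")
_BLANK = re.compile(r"[\t\n\r ]+")


def decode_hex_content(content):
    """Decode the (already unescaped) content of an h'' literal."""
    text = _COMMENT.sub(" ", content)
    text = _BLANK.sub("", text)
    if "..." in text:
        raise ValueError("elisions are not handled by this sketch")
    if not re.fullmatch(r"(?:[0-9A-Fa-f]{2})*", text):
        raise ValueError("content is not a sequence of hex digit pairs")
    return bytes.fromhex(text)
```

   Applied to the content of the example above,
   decode_hex_content("/head/ 63 /contents/ 66 6f 6f") returns the four
   bytes 63 66 6f 6f.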

A.2.2.  b64: ABNF Definition of Base64 representation of a byte string

   The syntax of the content of byte strings represented in base64 is
   described by the ABNF in Figure 3.

   This syntax allows both the classic (Section 4 of [RFC4648]) and the
   URL-safe (Section 5 of [RFC4648]) alphabet to be used.  It
   accommodates, but does not require, base64 padding.  Note that
   inclusion of classic base64 makes it impossible to have in-line
   comments in b64, as "/" is valid base64-classic.

   app-string-b64  = B *(4(b64dig B))
                     [b64dig B b64dig B ["=" B "=" / b64dig B ["="]] B]
                     ["#" *inon-lf]
   b64dig          = ALPHA / DIGIT / "-" / "_" / "+" / "/"
   B               = *iblank *(icomment *iblank)
   iblank          = %x0A / %x20  ; Not HT or CR (gone)
   icomment        = "#" *inon-lf %x0A
   inon-lf         = %x20-D7FF / %xE000-10FFFF
   ALPHA           = %x41-5a / %x61-7a
   DIGIT           = %x30-39

    Figure 3: ABNF definition of Base64 Representation of a Byte String
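
   A decoder for the content of a b64'' literal might be sketched as
   follows.  The function name decode_b64_content is hypothetical.  The
   sketch accepts both alphabets and optional padding; its check of the
   padding is looser than the ABNF in Figure 3, which ties the number of
   "=" characters to the number of trailing base64 digits.

```python
import base64
import re

_COMMENT = re.compile(r"#[^\n]*(?:\n|\Z)")  # end-of-line comments only
_BLANK = re.compile(r"[\n ]+")              # LF and space, as in iblank


def decode_b64_content(content):
    """Decode the (already unescaped) content of a b64'' literal."""
    text = _BLANK.sub("", _COMMENT.sub("", content))
    if not re.fullmatch(r"[A-Za-z0-9+/_-]*={0,2}", text):
        raise ValueError("unexpected character in base64 content")
    # Map the URL-safe alphabet onto the classic one and normalize padding.
    text = text.rstrip("=").translate(str.maketrans("-_", "+/"))
    if len(text) % 4 == 1:
        raise ValueError("a single trailing base64 digit cannot be decoded")
    text += "=" * (-len(text) % 4)
    return base64.b64decode(text, validate=True)
```

   With this sketch, the contents "Zm8=" and "Zm8" both decode to the
   two bytes 66 6f.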

A.2.3.  dt: ABNF Definition of RFC 3339 Representation of a Date/Time

   The syntax of the content of dt literals can be described by the ABNF
   for date-time from [RFC3339] as summarized in Section 3 of [RFC9165]:

   app-string-dt   = date-time

   date-fullyear   = 4DIGIT
   date-month      = 2DIGIT  ; 01-12
   date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                             ; month/year
   time-hour       = 2DIGIT  ; 00-23
   time-minute     = 2DIGIT  ; 00-59
   time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap sec
                             ; rules
   time-secfrac    = "." 1*DIGIT
   time-numoffset  = ("+" / "-") time-hour ":" time-minute
   time-offset     = "Z" / time-numoffset

   partial-time    = time-hour ":" time-minute ":" time-second
                     [time-secfrac]
   full-date       = date-fullyear "-" date-month "-" date-mday
   full-time       = partial-time time-offset

   date-time       = full-date "T" full-time
   DIGIT           =  %x30-39 ; 0-9

     Figure 4: ABNF Definition of RFC3339 Representation of a Date/Time
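
   The conversion of the content of a dt literal into an epoch-based
   time value might be sketched as follows.  The function name
   dt_to_epoch is hypothetical.  The sketch assumes that the result is
   an integer number of seconds since 1970-01-01T00:00:00Z when no
   fractional part is present and a floating point number otherwise; it
   rejects a seconds value of 60, as the date/time type of the platform
   used here cannot represent leap seconds.

```python
import re
from datetime import datetime, timedelta, timezone

# Transcribed from the date-time rule; "T" and "Z" are case-insensitive.
_DT = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})"
    r"(\.\d+)?([Zz]|[+-]\d{2}:\d{2})",
    re.ASCII)
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dt_to_epoch(content):
    """Return seconds since the epoch for an RFC 3339 date-time."""
    match = _DT.fullmatch(content)
    if not match:
        raise ValueError("not an RFC 3339 date-time: %r" % content)
    year, month, day, hour, minute, second = (
        int(group) for group in match.groups()[:6])
    fraction, offset = match.group(7), match.group(8)
    if offset in ("Z", "z"):
        zone = timezone.utc
    else:
        sign = -1 if offset[0] == "-" else 1
        zone = timezone(sign * timedelta(
            hours=int(offset[1:3]), minutes=int(offset[4:6])))
    # datetime() raises ValueError for out-of-range fields (and second 60).
    moment = datetime(year, month, day, hour, minute, second, tzinfo=zone)
    delta = moment - _EPOCH
    seconds = delta.days * 86400 + delta.seconds  # exact integer arithmetic
    if fraction:
        return seconds + float(fraction)
    return seconds
```

   For example, dt_to_epoch("1969-07-21T02:56:16Z") returns -14159024.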

A.2.4.  ip: ABNF Definition of Textual Representation of an IP Address

   The syntax of the content of ip literals can be described by the ABNF
   for IPv4address and IPv6address in Section 3.2.2 of [RFC3986], as
   included in slightly updated form in Figure 5.

   app-string-ip = IPaddress ["/" uint]

   IPaddress     = IPv4address
                 / IPv6address

   ; ABNF from RFC 3986, re-arranged for PEG compatibility:

   IPv6address   =                            6( h16 ":" ) ls32
                 /                       "::" 5( h16 ":" ) ls32
                 / [ h16               ] "::" 4( h16 ":" ) ls32
                 / [ h16 *1( ":" h16 ) ] "::" 3( h16 ":" ) ls32
                 / [ h16 *2( ":" h16 ) ] "::" 2( h16 ":" ) ls32
                 / [ h16 *3( ":" h16 ) ] "::"    h16 ":"   ls32
                 / [ h16 *4( ":" h16 ) ] "::"              ls32
                 / [ h16 *5( ":" h16 ) ] "::"              h16
                 / [ h16 *6( ":" h16 ) ] "::"

   h16           = 1*4HEXDIG
   ls32          = ( h16 ":" h16 ) / IPv4address
   IPv4address   = dec-octet "." dec-octet "." dec-octet "." dec-octet
   dec-octet     = "25" %x30-35         ; 250-255
                 / "2" %x30-34 DIGIT    ; 200-249
                 / "1" 2DIGIT           ; 100-199
                 / %x31-39 DIGIT        ; 10-99
                 / DIGIT                ; 0-9

   HEXDIG        = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
   DIGIT         = %x30-39 ; 0-9
   DIGIT1        = %x31-39 ; 1-9
   uint          = "0" / DIGIT1 *DIGIT

    Figure 5: ABNF Definition of Textual Representation of an IP Address
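
   The conversion of the content of an ip literal might be sketched as
   follows.  The function name ip_content_to_value is hypothetical.
   The sketch assumes the untagged result is a byte string for a plain
   address and, modelled on the prefix format of [RFC9164], an array of
   prefix length and address bytes (with trailing zero bytes removed)
   when a prefix length is given.  It relies on the address parser of
   the platform, which may accept or reject some inputs differently
   from the ABNF in Figure 5.

```python
import ipaddress


def ip_content_to_value(content):
    """Return bytes for an address, or [length, bytes] for a prefix."""
    address, slash, prefix = content.partition("/")
    if not slash:
        return ipaddress.ip_address(address).packed
    if not (prefix.isascii() and prefix.isdigit()):
        raise ValueError("prefix length must be a decimal number")
    # strict=True raises ValueError if bits beyond the prefix are set.
    network = ipaddress.ip_network(content, strict=True)
    trimmed = network.network_address.packed.rstrip(b"\x00")
    return [network.prefixlen, trimmed]
```

   For example, ip_content_to_value("192.0.2.42") returns the four
   bytes c0 00 02 2a, and ip_content_to_value("192.0.2.0/24") returns
   the prefix length 24 together with the three bytes c0 00 02.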

Appendix B.  EDN and CDDL

   EDN was designed as a language to provide a human-readable
   representation of an instance, i.e., a single CBOR data item or CBOR
   sequence.  CDDL was designed as a language to describe an (often
   large) set of such instances (which itself constitutes a language),
   in the form of a _data definition_ or _grammar_ (or sometimes called
   _schema_).

   The two languages share some similarities, not least because they
   have mutually inspired each other.  But they have very different
   roots:

   *  EDN syntax is an extension to JSON syntax [STD90].  (Any
      (interoperable) JSON text is also valid EDN.)

   *  CDDL syntax is inspired by ABNF's syntax [STD68].

   For engineers that are using both EDN and CDDL, it is easy to write
   "CDDLisms" or "EDNisms" into their drafts that are meant to be in the
   other language.  (This is one more of the many motivations to always
   validate formal language instances with tools.)

   Important differences include:

   *  Comment syntax.  CDDL inherits ABNF's end-of-line comments, which
      start with a semicolon, while EDN finds nothing in JSON that
      could be inherited here.  Inspired by JavaScript, EDN simplifies
      JavaScript's copy of the original C comment syntax to be delimited
      by single slashes (where line ends are not of interest); it also
      adds end-of-line comments starting with #.

      EDN:
         { / alg / 1: -7 / ECDSA 256 / }
         ,
         { 1:   # alg
             -7 # ECDSA 256
         }
      CDDL:  ? 1 => int / tstr, ; algorithm identifier

   *  Syntax for tags.  CDDL's tag syntax is part of the system for
      referring to CBOR's fundamentals (the major type 6, in this case)
      and (with [I-D.ietf-cbor-update-8610-grammar]) allows specifying
      the actual tag number separately, while EDN's tag syntax is a
      simple decimal number and a pair of parentheses.

      EDN:  98(['', {}, /rest elided here: …/])

      CDDL:  COSE_Sign_Tagged = #6.98(COSE_Sign)

   *  Separator character.  Like JSON, EDN requires commas as separators
      between array elements and map members (EDN also allows, but does
      not require, a trailing comma before the closing bracket/brace,
      enabling an easier to maintain "terminator" style of their use).
      CDDL's comma separators in these contexts (CDDL groups) are
      entirely optional (and actually are terminators, which together
      with their optionality allows them to be used like separators as
      well, or even not at all).

   *  Embedded CBOR.  EDN has a special syntax to describe the content
      of byte strings that are encoded CBOR data items.  CDDL can
      specify these with a control operator, which looks very different.

      EDN:  98([/h'a10126'/ << {/alg/ 1: -7 /ECDSA 256/ } >>, /…/])

      CDDL:  serialized_map = bytes .cbor header_map
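
   The relationship between the embedded map and the byte string
   h'a10126' given in the comment of the EDN example can be checked
   with a minimal sketch.  The helpers encode_small_int and
   encode_small_map are hypothetical and only cover integers in the
   range -24..23 and maps with at most 23 entries, which is all that
   this example needs.

```python
def encode_small_int(value):
    """Encode an integer in the range -24..23 as one CBOR byte."""
    if 0 <= value <= 23:
        return bytes([value])                # major type 0
    if -24 <= value < 0:
        return bytes([0x20 | (-1 - value)])  # major type 1
    raise ValueError("integer out of range for this sketch")


def encode_small_map(pairs):
    """Encode a list of (key, value) pairs of small integers as a map."""
    if len(pairs) > 23:
        raise ValueError("map too large for this sketch")
    encoded = bytes([0xA0 | len(pairs)])     # major type 5
    for key, value in pairs:
        encoded += encode_small_int(key) + encode_small_int(value)
    return encoded


header = encode_small_map([(1, -7)])          # {/alg/ 1: -7 /ECDSA 256/}
embedded = bytes([0x40 | len(header)]) + header  # << ... >> as byte string
```

   Here header consists of the bytes a1 01 26, and embedded is the
   encoded byte string 43 a1 01 26 that carries them.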

Acknowledgements

   The concept of application-oriented extensions to diagnostic
   notation, as well as the definition for the "dt" extension, were
   inspired by the CoRAL work by Klaus Hartke.

Author's Address

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany
   Phone: +49-421-218-63921
   Email: cabo@tzi.org
