Main logging schema for qlog
draft-ietf-quic-qlog-main-schema-07

Document	Type	This is an older version of an Internet-Draft whose latest revision state is "Active".
	Authors	Robin Marx , Luca Niccolini , Marten Seemann , Lucas Pardue
	Last updated	2024-02-28 (Latest revision 2023-10-23)
	Replaces	draft-marx-qlog-main-schema
	RFC stream	Internet Engineering Task Force (IETF)
	Reviews	SECDIR Early review (of -05) by Dan Harkins Has nits
	Additional resources	Mailing list discussion
Stream	WG state	WG Document
	Associated WG milestone	Qlog documents to IESG
	Document shepherd	(None)
IESG	IESG state	I-D Exists
	Consensus boilerplate	Unknown
	Telechat date	(None)
	Responsible AD	(None)
	Send notices to	(None)
QUIC                                                        R. Marx, Ed.
Internet-Draft                                                    Akamai
Intended status: Standards Track                       L. Niccolini, Ed.
Expires: 25 April 2024                                              Meta
                                                         M. Seemann, Ed.
                                                           Protocol Labs
                                                          L. Pardue, Ed.
                                                              Cloudflare
                                                         23 October 2023

                      Main logging schema for qlog
                  draft-ietf-quic-qlog-main-schema-07

Abstract

   This document defines qlog, an extensible high-level schema for a
   standardized logging format.  It allows easy sharing of data,
   benefitting common debug and analysis methods and tooling.  The high-
   level schema is independent of protocol; separate documents extend
   qlog for protocol-specific data.  The schema is also independent of
   serialization format, allowing logs to be represented in many ways
   such as JSON, CSV, or protobuf.

Note to Readers

      Note to RFC editor: Please remove this section before publication.

   Feedback and discussion are welcome at
   https://github.com/quicwg/qlog.  Readers are advised to refer to the
   "editor's draft" at that URL for an up-to-date version of this
   document.

   Concrete examples of integrations of this schema in various
   programming languages can be found at
   https://github.com/quiclog/qlog/.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

Marx, et al.              Expires 25 April 2024                 [Page 1]
Internet-Draft        Main logging schema for qlog          October 2023

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 25 April 2024.

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Notational Conventions
       1.1.1.  Schema definition
       1.1.2.  Serialization examples
   2.  Design goals
   3.  QlogFile schema
     3.1.  Traces
     3.2.  Trace
     3.3.  TraceError
   4.  QlogFileSeq schema
     4.1.  TraceSeq
   5.  VantagePoint
   6.  Events
     6.1.  Timestamps
     6.2.  Names
     6.3.  Data
     6.4.  ProtocolType
     6.5.  Triggers
     6.6.  Grouping
     6.7.  SystemInformation
     6.8.  CommonFields
   7.  Raw packet and frame information
   8.  Common events and data classes
     8.1.  Generic events
       8.1.1.  error
       8.1.2.  warning
       8.1.3.  info
       8.1.4.  debug
       8.1.5.  verbose
     8.2.  Simulation events
       8.2.1.  scenario
       8.2.2.  marker
   9.  Event definition guidelines
     9.1.  Event design
     9.2.  Event importance indicators
     9.3.  Custom fields
   10. Serializing qlog
     10.1.  qlog to JSON mapping
       10.1.1.  I-JSON
       10.1.2.  Truncated values
     10.2.  qlog to JSON Text Sequences mapping
       10.2.1.  Supporting JSON Text Sequences in tooling
     10.3.  Other optimized formatting options
       10.3.1.  Data structure optimizations
       10.3.2.  Compression
       10.3.3.  Binary formats
       10.3.4.  Overview and summary
     10.4.  Conversion between formats
   11. Methods of access and generation
     11.1.  Set file output destination via an environment variable
   12. Tooling requirements
   13. Security and privacy considerations
   14. IANA Considerations
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Acknowledgements
   Change Log
     Since draft-ietf-quic-qlog-main-schema-06:
     Since draft-ietf-quic-qlog-main-schema-05:
     Since draft-ietf-quic-qlog-main-schema-04:
     Since draft-ietf-quic-qlog-main-schema-03:
     Since draft-ietf-quic-qlog-main-schema-02:
     Since draft-ietf-quic-qlog-main-schema-01:
     Since draft-ietf-quic-qlog-main-schema-00:
     Since draft-marx-qlog-main-schema-draft-02:
     Since draft-marx-qlog-main-schema-01:
     Since draft-marx-qlog-main-schema-00:
   Authors' Addresses

1.  Introduction

   Endpoint logging is a useful strategy for capturing and understanding
   how applications using network protocols are behaving, particularly
   where protocols have an encrypted wire image that restricts
   observers' ability to see what is happening.

   Many applications implement logging using a custom, non-standard
   logging format.  This has an effect on the tools and methods that are
   used to analyze the logs, for example to perform root cause analysis
   of an interoperability failure between distinct implementations.  A
   lack of a common format impedes the development of common tooling
   that can be used by all parties that have access to logs.

   This document defines qlog, an extensible high-level schema and
   harness that provides a shareable, aggregatable and structured
   logging format.  This high-level schema is independent of protocol,
   with logging entries for specific protocols and use cases being
   defined in other documents (see for example [QLOG-QUIC] for QUIC and
   [QLOG-H3] for HTTP/3 and QPACK-related event definitions).

   The goal of this high-level schema is to provide amenities and
   default characteristics that each logging file should contain (or
   should be able to contain), such that generic and reusable toolsets
   can be created that can deal with logs from a variety of different
   protocols and use cases.

   As such, qlog provides versioning, metadata inclusion, log
   aggregation, event grouping and log file size reduction techniques.

   The qlog schema can be serialized in many ways (e.g., JSON, CBOR, or
   protobuf).  This document describes only how to employ [JSON], its
   subset [I-JSON], and its streamable derivative
   [JSON-Text-Sequences].

1.1.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.1.1.  Schema definition

   To define events and data structures, all qlog documents use the
   Concise Data Definition Language [CDDL].  This document uses the
   basic syntax, the specific text, uint, float32, float64, bool, and
   any types, as well as the .default, .size, and .regexp control
   operators, the ~ unwrapping operator, and the $ extension point
   syntax from [CDDL].

   Additionally, this document defines the following custom types for
   clarity:

   ; CDDL's uint is defined as being 64-bit in size
   ; but for many protocol fields it is better to be restrictive
   ; and explicit
   uint8 = uint .size 1
   uint16 = uint .size 2
   uint32 = uint .size 4
   uint64 = uint .size 8

   ; an even-length lowercase string of hexadecimally encoded bytes
   ; examples: 82dc, 027339, 4cdbfd9bf0
   ; this is needed because the default CDDL binary string (bytes/bstr)
   ; is only CBOR and not JSON compatible
   hexstring = text .regexp "([0-9a-f]{2})*"

                 Figure 1: Additional CDDL type definitions
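
   As an illustration of the types in Figure 1, the following Python
   sketch checks values against the hexstring and sized uint
   definitions.  The function names are hypothetical and not part of
   qlog; the regular expression is the one from Figure 1, applied to
   the whole string.

```python
import re

# Pattern copied from the hexstring definition in Figure 1.
# fullmatch() applies it to the entire string.
HEXSTRING_RE = re.compile(r"([0-9a-f]{2})*")


def is_hexstring(value):
    """Return True for an even-length, lowercase hex string (hypothetical helper)."""
    return isinstance(value, str) and HEXSTRING_RE.fullmatch(value) is not None


def fits_uint(value, size_in_bytes):
    """Return True if value is a non-negative integer that fits in size_in_bytes bytes.

    Mirrors "uint .size N" from Figure 1 (hypothetical helper).
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value < (1 << (8 * size_in_bytes))
```

   For example, is_hexstring("82dc") holds while is_hexstring("82DC")
   and is_hexstring("82d") do not, and fits_uint(256, 1) is false
   because 256 does not fit in a uint8.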

   All timestamps and time-related values (e.g., offsets) in qlog are
   logged as float64 values in milliseconds.

   Other qlog documents can define their own CDDL-compatible (struct)
   types (e.g., separately for each Packet type that a protocol
   supports).

      Note to RFC editor: Please remove the following text in this
      section before publication.

   The main CDDL syntax conventions that a reader of this document
   should be aware of are:

   *  ? obj : this object is optional

   *  TypeName1 / TypeName2 : a union of these two types (object can be
      either type 1 OR type 2)

   *  obj: TypeName : this object has this concrete type

   *  obj: [* TypeName] : this object is an array of this type with
      minimum size of 0 elements

   *  obj: [+ TypeName] : this object is an array of this type with
      minimum size of 1 element

   *  TypeName = ... : defines a new type

   *  EnumName = "entry1" / "entry2" / "entry3" / ...: defines an enum

   *  StructName = { ... } : defines a new struct type

   *  ; : single-line comment

   *  * text => any : special syntax to indicate 0 or more fields that
      have a string key that maps to any value.  Used to indicate a
      generic JSON object.

1.1.2.  Serialization examples

   Serialization examples in this document use JSON ([JSON]) unless
   otherwise indicated.

2.  Design goals

   The main tenets for the qlog schema design are:

   *  Streamable, event-based logging

   *  A flexible format that can reduce log producer overhead, at the
      cost of increased complexity for consumers (e.g. tools)

   *  Extensible and pragmatic

   *  Aggregation and transformation friendly (e.g., the top-level
      element for the non-streaming format is a container for individual
      traces, group_ids can be used to tag events to a particular
      context)

   *  Metadata is stored together with event data

3.  QlogFile schema

   A qlog using the QlogFile schema can contain several individual
   traces and logs from multiple vantage points that are in some way
   related.  The top-level element in this schema defines only a small
   set of "header" fields and an array of component traces, defined in
   Figure 2 as:

   QlogFile = {
       qlog_version: text
       ? qlog_format: text .default "JSON"
       ? title: text
       ? description: text
       ? traces: [+ Trace /
                    TraceError]
   }

                       Figure 2: QlogFile definition

   The required "qlog_version" field MUST have the value "0.4".

   The optional "qlog_format" field indicates the serialization format.
   Its value MUST either be one of the options defined in this document
   (i.e., Section 10) or the field MUST be omitted entirely.  When the
   field is omitted the default value of "JSON" applies.

   The optional "title" and "description" fields provide additional
   free-text information about the file.

   The optional "traces" field contains an array of qlog traces
   (Section 3.2), each of which contains metadata and an array of qlog
   events (Section 6).

   In order to make it easier to parse and identify qlog files and their
   serialization format, the "qlog_version" and "qlog_format" fields and
   their values SHOULD be in the first 256 characters/bytes of the
   resulting log file.
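
   A tool could use this recommendation to identify a qlog file
   without parsing it fully.  The following Python sketch is one
   possible approach, assuming the header fields are serialized as
   plain JSON strings; the function name and its parameters are
   hypothetical.

```python
import re


def sniff_qlog_header(data, limit=256):
    """Look for qlog_version and qlog_format in the first `limit` bytes.

    `data` is the raw start of a log file (bytes). Returns a dict with
    whichever of the two fields were found. Hypothetical helper.
    """
    # A multi-byte character may be cut at the limit, so decode leniently.
    head = data[:limit].decode("utf-8", errors="replace")
    found = {}
    for field in ("qlog_version", "qlog_format"):
        match = re.search(r'"%s"\s*:\s*"([^"]*)"' % field, head)
        if match:
            found[field] = match.group(1)
    return found
```

   Because the placement is only a SHOULD, an empty result does not
   prove that the input is not a qlog file; a tool would then fall
   back to full parsing.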

   Where a qlog file is serialized to a JSON format, one of the
   downsides is that it is inherently a non-streamable format.  Put
   differently, it is not possible to simply append new qlog events to a
   log file without "closing" this file at the end by appending "]}]}".
   Without these closing tags, most JSON parsers will be unable to parse
   the file entirely.  The alternative QlogFileSeq (Section 4) is better
   suited to streaming.

   JSON serialization example:

   {
       "qlog_version": "0.4",
       "qlog_format": "JSON",
       "title": "Name of this particular qlog file (short)",
       "description": "Description for this group of traces (long)",
       "traces": [...]
   }

                         Figure 3: QlogFile example

3.1.  Traces

   It can be advantageous to group several related qlog traces together
   in a single file.  For example, it is possible to simultaneously
   perform logging on the client, on the server, and on a single point
   on their common network path.  For analysis, it is useful to
   aggregate these three individual traces together into a single file,
   so it can be uniquely stored, transferred, and annotated.

   The QlogFile "traces" field is an array that contains a list of
   individual qlog traces.  When capturing a qlog at a vantage point, it
   is expected that the traces field contains a single entry.  Files can
   be aggregated, for example as part of a post-processing operation, by
   copying the traces in the component files into the combined "traces"
   array of a new, aggregated qlog file.
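
   The aggregation step described above can be sketched in Python as
   follows.  This is a minimal illustration that assumes the component
   files have already been parsed into dictionaries and all use
   qlog_version "0.4"; the function name is hypothetical.

```python
def aggregate_qlog_files(qlog_files, title=None):
    """Combine parsed QlogFile objects (dicts) into a new QlogFile dict.

    Hypothetical helper; assumes every input uses qlog_version "0.4".
    """
    # Put version and format first so they end up early in the output.
    combined = {"qlog_version": "0.4", "qlog_format": "JSON"}
    if title is not None:
        combined["title"] = title

    traces = []
    for qlog in qlog_files:
        # "traces" is optional in QlogFile, so tolerate its absence.
        traces.extend(qlog.get("traces", []))

    # The schema requires at least one entry when "traces" is present.
    if traces:
        combined["traces"] = traces
    return combined
```

   Passing the parsed client, server, and network captures to
   aggregate_qlog_files() yields one object whose "traces" array holds
   all three traces in input order.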

3.2.  Trace

   The exact conceptual definition of a Trace can be fluid.  For
   example, a trace could contain all events for a single connection,
   for a single endpoint, for a single measurement interval, for a
   single protocol, etc.  In the normal use case however, a trace is a
   log of a single data flow collected at a single location or vantage
   point.  For example, for QUIC, a single trace only contains events
   for a single logical QUIC connection for either the client or the
   server.

   A Trace contains some metadata in addition to qlog events, defined in
   Figure 4 as:

   Trace = {
       ? title: text
       ? description: text
       ? common_fields: CommonFields
       ? vantage_point: VantagePoint
       events: [* Event]
   }

                         Figure 4: Trace definition

   The optional "title" and "description" fields provide additional
   free-text information about the trace.

   The optional "common_fields" field is described in Section 6.8.

   The optional "vantage_point" field is described in Section 5.

   The semantics and context of the trace can mainly be deduced from the
   entries in the "common_fields" list and "vantage_point" field.

   JSON serialization example:

   {
       "title": "Name of this particular trace (short)",
       "description": "Description for this trace (long)",
       "common_fields": {
           "ODCID": "abcde1234",
           "time_format": "absolute"
       },
       "vantage_point": {
           "name": "backend-67",
           "type": "server"
       },
       "events": [...]
   }

                          Figure 5: Trace example

3.3.  TraceError

   A TraceError indicates that an attempt to find/convert a file for
   inclusion in the aggregated qlog was made, but there was an error
   during the process.  Rather than silently dropping the erroneous
   file, it can be explicitly included in the qlog file as an entry in
   the "traces" array, defined in Figure 6 as:

   TraceError = {
       error_description: text

       ; the original URI used for attempted find of the file
       ? uri: text
       ? vantage_point: VantagePoint
   }

                      Figure 6: TraceError definition

   JSON serialization example:

   {
       "error_description": "File could not be found",
       "uri": "/srv/traces/today/latest.qlog",
       "vantage_point": { "type": "server" }
   }

                        Figure 7: TraceError example
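
   An aggregation tool might produce such an entry as sketched below
   in Python.  The function name and the choice to use the exception
   text as the error_description are assumptions made for
   illustration only.

```python
import json


def load_traces_or_error(uri, vantage_point_type="unknown"):
    """Return the traces stored at `uri`, or a list holding one TraceError.

    Hypothetical helper: `uri` is treated as a local file path.
    """
    try:
        with open(uri, "r", encoding="utf-8") as f:
            qlog = json.load(f)
        if not isinstance(qlog, dict):
            raise ValueError("top-level element is not a JSON object")
        return qlog.get("traces", [])
    except (OSError, ValueError) as exc:
        # Missing files raise OSError; malformed JSON raises a ValueError
        # subclass. Record the failure instead of silently dropping it.
        return [{
            "error_description": str(exc),
            "uri": uri,
            "vantage_point": {"type": vantage_point_type},
        }]
```

   The returned list can be appended to the "traces" array of the
   aggregated file in either case.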

   Note that another way to combine events of different traces in a
   single qlog file is through the use of the "group_id" field,
   discussed in Section 6.6.

4.  QlogFileSeq schema

   A qlog file using the QlogFileSeq schema can be serialized to a
   streamable JSON format called JSON Text Sequences (JSON-SEQ)
   ([RFC7464]).  The top-level element in this schema defines only a
   small set of "header" fields and the metadata of a single trace,
   defined in Figure 8 as:

   QlogFileSeq = {
       qlog_format: "JSON-SEQ"
       qlog_version: text
       ? title: text
       ? description: text
       trace: TraceSeq
   }

                      Figure 8: QlogFileSeq definition

   The required "qlog_format" field MUST have the value "JSON-SEQ".

   The required "qlog_version" field MUST have the value "0.4".

   The optional "title" and "description" fields provide additional
   free-text information about the file.

   The required "trace" field contains the metadata of a single trace.
   All qlog events in the file are related to this trace.

   JSON-SEQ serialization example:

   // list of qlog events, serialized in accordance with RFC 7464,
   // starting with a Record Separator character and ending with a
   // newline.
   // For display purposes, Record Separators are rendered as <RS>

   <RS>{
       "qlog_version": "0.4",
       "qlog_format": "JSON-SEQ",
       "title": "Name of JSON Text Sequence qlog file (short)",
       "description": "Description for this trace file (long)",
       "trace": {
         "common_fields": {
           "protocol_type": ["QUIC","HTTP3"],
           "group_id":"127ecc830d98f9d54a42c4f0842aa87e181a",
           "time_format":"relative",
           "reference_time": 1553986553572
         },
         "vantage_point": {
           "name":"backend-67",
           "type":"server"
         }
       }
   }
   <RS>{"time": 2, "name": "quic:parameters_set", "data": { ... } }
   <RS>{"time": 7, "name": "quic:packet_sent", "data": { ... } }
   ...

                        Figure 9: Top-level element

   For further information about serialization, see Section 10.2.
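
   The record framing shown in Figure 9 can be sketched in Python as
   follows.  This is a simplified illustration with hypothetical
   function names; it assumes well-formed input and does not implement
   the handling of truncated records described in [RFC7464].

```python
import json

RS = "\x1e"  # ASCII Record Separator, rendered as <RS> in Figure 9


def encode_record(obj):
    """Serialize one JSON value as a record: RS, JSON text, newline."""
    return RS + json.dumps(obj) + "\n"


def decode_records(text):
    """Split a JSON text sequence into parsed JSON values."""
    records = []
    for chunk in text.split(RS):
        chunk = chunk.strip()
        if chunk:  # skip the empty piece before the first RS
            records.append(json.loads(chunk))
    return records


def split_qlog_seq(text):
    """Return (header, events): the QlogFileSeq header and the events after it."""
    records = decode_records(text)
    if not records:
        raise ValueError("no records found")
    return records[0], records[1:]
```

   A logger can write encode_record(header) once and then append
   encode_record(event) for every new event, without ever having to
   close an enclosing array.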

4.1.  TraceSeq

   TraceSeq is used with QlogFileSeq.  It is conceptually similar to a
   Trace, with the exception that qlog events are not contained within
   it, but rather appended after it in a QlogFileSeq.

   TraceSeq = {
       ? title: text
       ? description: text
       ? common_fields: CommonFields
       ? vantage_point: VantagePoint
   }

                       Figure 10: TraceSeq definition

5.  VantagePoint

   A VantagePoint describes the vantage point from which a trace
   originates, defined in Figure 11 as:

   VantagePoint = {
       ? name: text
       type: VantagePointType
       ? flow: VantagePointType
   }

   ; client = endpoint which initiates the connection
   ; server = endpoint which accepts the connection
   ; network = observer in between client and server
   VantagePointType = "client" /
                      "server" /
                      "network" /
                      "unknown"

                     Figure 11: VantagePoint definition

   JSON serialization examples:

   {
       "name": "aioquic client",
       "type": "client"
   }

   {
       "name": "wireshark trace",
       "type": "network",
       "flow": "client"
   }

                      Figure 12: VantagePoint example

   The flow field is only required if the type is "network" (for
   example, the trace is generated from a packet capture).  It is used
   to disambiguate events like "packet sent" and "packet received".
   This is indicated explicitly because for multiple reasons (e.g.,
   privacy) data from which the flow direction can be otherwise inferred
   (e.g., IP addresses) might not be present in the logs.

   The different values for the flow field have the following meanings:

   *  "client" indicates that this vantage point follows client data
      flow semantics (a "packet sent" event goes in the direction of the
      server).

   *  "server" indicates that this vantage point follows server data
      flow semantics (a "packet sent" event goes in the direction of the
      client).

   *  "unknown" indicates that the flow's direction is unknown.

   Depending on the context, tools confronted with "unknown" values in
   the vantage_point can either try to heuristically infer the semantics
   from protocol-level domain knowledge (e.g., in QUIC, the client
   always sends the first packet) or give the user the option to switch
   between client and server perspectives manually.
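
   The flow semantics above can be sketched in Python as follows.  The
   function name is hypothetical, and the sketch deliberately returns
   "unknown" rather than attempting any protocol-specific heuristics.

```python
def packet_sent_destination(vantage_point):
    """Return the endpoint that a "packet sent" event travels towards.

    `vantage_point` is a parsed VantagePoint dict. Hypothetical helper.
    """
    vp_type = vantage_point["type"]
    if vp_type == "network":
        # A network observer needs "flow" to pick a perspective.
        perspective = vantage_point.get("flow", "unknown")
    else:
        perspective = vp_type

    if perspective == "client":
        return "server"
    if perspective == "server":
        return "client"
    # Covers "unknown" and any flow value without defined semantics.
    return "unknown"
```

   For the "wireshark trace" example in Figure 12 this returns
   "server", since the trace follows client data flow semantics.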

6.  Events

   A qlog event is specified as a generic object with a number of member
   fields and their associated data.  Depending on the protocol and use
   case, the exact member field names and their formats can differ
   across implementations.  This section lists the main, pre-defined and
   reserved field names with specific semantics and expected
   corresponding value formats.

   An Event is defined in Figure 13 as:

   Event = {
       time: float64
       name: text
       data: $ProtocolEventBody
       ? time_format: TimeFormat
       ? protocol_type: ProtocolType
       ? group_id: GroupID
       ? system_info: SystemInformation

       ; events can contain any amount of custom fields
       * text => any
   }

                        Figure 13: Event definition

   Each qlog event MUST contain the mandatory fields: "time"
   (Section 6.1), "name" (Section 6.2), and "data" (Section 6.3).

   Each qlog event MAY contain the optional fields: "time_format"
   (Section 6.1), "protocol_type" (Section 6.4), "trigger"
   (Section 6.5), and "group_id" (Section 6.6).

   Multiple events can appear in a Trace or TraceSeq and they might
   contain fields with identical values.  It is possible to optimize out
   this duplication using "common_fields" (Section 6.8).

   The specific values for each of these fields and their semantics are
   defined in separate documents, depending on protocol or use case.
   For example: event definitions for QUIC, HTTP/3 and QPACK can be
   found in [QLOG-QUIC] and [QLOG-H3].

   Events are intended to be extended with custom fields, therefore they
   MAY contain other fields not defined in this document.  Custom fields
   may be known or unknown to tools.  Tools SHOULD allow for the
   presence of unknown event fields, but their semantics depend on the
   context of the log usage.

   JSON serialization:

   {
       "time": 1553986553572,

       "name": "quic:packet_sent",
       "data": { ... },

       "protocol_type":  ["QUIC","HTTP3"],
       "group_id": "127ecc830d98f9d54a42c4f0842aa87e181a",

       "time_format": "absolute",

       "ODCID": "127ecc830d98f9d54a42c4f0842aa87e181a"
   }

                          Figure 14: Event example
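
   A tool could check the mandatory fields and tolerate custom fields
   along the lines of the following Python sketch.  The function name
   is hypothetical, and the set of known fields is taken from
   Figure 13 only.

```python
REQUIRED_FIELDS = ("time", "name", "data")
# Optional fields listed in the Event definition (Figure 13).
KNOWN_OPTIONAL_FIELDS = ("time_format", "protocol_type", "group_id", "system_info")


def check_event(event):
    """Return (missing, custom) for a parsed event dict.

    missing: mandatory field names absent from the event.
    custom:  sorted field names not defined in Figure 13.
    Hypothetical helper.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    known = set(REQUIRED_FIELDS) | set(KNOWN_OPTIONAL_FIELDS)
    custom = sorted(key for key in event if key not in known)
    return missing, custom
```

   For the event in Figure 14, nothing is missing and "ODCID" is
   reported as a custom field, which a tool is expected to tolerate
   rather than reject.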

6.1.  Timestamps

   An event's "time" field indicates the timestamp at which the event
   occurred.  Its value is typically the Unix timestamp since the 1970
   epoch (number of milliseconds since midnight UTC, January 1, 1970,
   ignoring leap seconds).  However, qlog supports two more succinct
   timestamp formats to reduce file size.  The employed format
   is indicated in the "time_format" field, which allows one of three
   values: "absolute", "delta" or "relative".

   Definition:

   TimeFormat = "absolute" /
                "delta" /
                "relative"

                      Figure 15: TimeFormat definition

   *  Absolute: Include the full absolute timestamp with each event.
      This approach uses the largest number of characters.  This is also
      the default value of the "time_format" field.

   *  Delta: Encode each time value as the difference from the
      previously logged value.  The first event in a trace typically
      logs the full absolute timestamp.  This approach uses the fewest
      characters.

   *  Relative: Specify a full "reference_time" timestamp (typically
      this is done up-front in "common_fields", see Section 6.8) and
      include only values relative to this reference_time with each
      event.  The "reference_time" value is typically the first
      absolute timestamp.  This approach uses a moderate number of
      characters.

   The first option is good for stateless loggers, the second and third
   for stateful loggers.  The third option is generally preferred, since
   it produces smaller files while being easier to reason about.  An
   example for each option can be seen in Figure 16.

   The absolute approach will use:
   1500, 1505, 1522, 1588

   The delta approach will use:
   1500, 5, 17, 66

   The relative approach will:
   - set the reference_time to 1500 in "common_fields"
   - use: 0, 5, 22, 88

        Figure 16: Three different approaches for logging timestamps
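
   A tool that needs a uniform view can convert all three formats back
   to absolute values.  The following Python sketch reproduces the
   numbers from Figure 16; the function name and its parameters are
   hypothetical, and it assumes the first "delta" value is the full
   absolute timestamp.

```python
def to_absolute(times, time_format="absolute", reference_time=0):
    """Convert a list of logged "time" values to absolute timestamps.

    Hypothetical helper; reference_time is only used for "relative".
    """
    if time_format == "absolute":
        return list(times)
    if time_format == "relative":
        return [reference_time + t for t in times]
    if time_format == "delta":
        absolute = []
        current = 0
        for t in times:
            # Each value is an offset from the previously logged event.
            current += t
            absolute.append(current)
        return absolute
    raise ValueError("unknown time_format: %r" % (time_format,))
```

   With the values from Figure 16, to_absolute([1500, 5, 17, 66],
   "delta") and to_absolute([0, 5, 22, 88], "relative", 1500) both
   yield [1500, 1505, 1522, 1588].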

   One of these options is typically chosen for the entire trace (put
   differently: each event has the same value for the "time_format"
   field).  Each event MUST include a timestamp in the "time" field.

   Events in each individual trace SHOULD be logged in strictly
   ascending timestamp order (though the logged values are not
   necessarily ascending for the "delta" format).  Tools MAY sort all
   events on the timestamp before processing them, though they are not
   required to (as this could impose a significant processing
   overhead).  This can be a problem especially for multi-threaded
   and/or streaming loggers, which could consider using a separate
   post-processor to order qlog events in time if a tool does not
   provide this feature.

   Timestamps do not have to use the Unix epoch as their reference.
   For example, for privacy reasons, any initial reference timestamp
   (for example "endpoint uptime in ms" or "time since connection start
   in ms") can be chosen.  Tools SHOULD NOT assume the ability to derive
   the absolute Unix timestamp from qlog traces, nor rely on them to
   relatively order events across two or more separate traces (in this
   case, clock drift should also be taken into account).

6.2.  Names

   Events differ mainly in the type of metadata associated with them.
   The "name" field is an identifier that parsers can use to decide how
   to interpret the event metadata contained in the "data" field (see
   Section 6.3).

   Event names indicate a category and type.  The "name" field MUST
   contain a non-empty character sequence representing a category,
   followed by a colon (':'), followed by a non-empty character sequence
   representing a type.

   A category provides a higher-level grouping of event types.  For
   example, for QUIC and HTTP/3, the different categories could be
   "quic", "http", "qpack", and "recovery".  Within these categories,
   the event type provides additional granularity.  For example, within
   the "quic" category, there would be "packet_sent" and
   "packet_received" events.

   JSON serialization example:

   {
       "name": "quic:packet_sent"
   }

      Figure 17: An event with category "quic" and type "packet_sent".
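
   As a non-normative sketch, a parser could split and validate the
   "name" field as shown below.  The function name split_event_name is
   hypothetical, and splitting at the first colon is an assumption of
   this example; the text above only requires a non-empty category and a
   non-empty type separated by a colon.

```python
def split_event_name(name):
    """Split a qlog event name into (category, type).

    Assumption for this sketch: the first ':' is the separator.
    """
    category, separator, event_type = name.partition(":")
    if not separator or not category or not event_type:
        raise ValueError(f"not a valid 'category:type' name: {name!r}")
    return category, event_type
```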

6.3.  Data

   An event's "data" field is a generic object.  It contains the per-
   event metadata and its form and semantics are defined per specific
   sort of event.  For example, data field value definitions for QUIC
   and HTTP/3 can be found in [QLOG-QUIC] and [QLOG-H3].

   This field is defined here as a CDDL extension point (a "socket" or
   "plug") named $ProtocolEventBody.  Other documents MUST properly
   extend this extension point when defining new data field content
   options to enable automated validation of aggregated qlog schemas.

   The only common field defined for the data field is the trigger
   field, which is discussed in Section 6.5.

   Definition:

   ; The ProtocolEventBody is any key-value map (e.g., JSON object)
   ; only the optional trigger field is defined in this document
   $ProtocolEventBody /= {
       ? trigger: text
       * text => any
   }
   ; event documents are intended to extend this socket by using:
   ; NewProtocolEvents = EventType1 /
   ;                     EventType2 /
   ;                     ... /
   ;                     EventTypeN
   ; $ProtocolEventBody /= NewProtocolEvents

                  Figure 18: ProtocolEventBody definition

   One purely illustrative example for a QUIC "packet_sent" event is
   shown in Figure 19:

   TransportPacketSent = {
       ? packet_size: uint16
       header: PacketHeader
       ? frames: [* QuicFrame]
       ? trigger: "pto_probe" /
                  "retransmit_timeout" /
                  "bandwidth_probe"
   }

   could be serialized as

   {
       "packet_size": 1280,
       "header": {
           "packet_type": "1RTT",
           "packet_number": 123
       },
       "frames": [
           {
               "frame_type": "stream",
               "length": 1000,
               "offset": 456
           },
           {
               "frame_type": "padding"
           }
       ]
   }

    Figure 19: Example of the 'data' field for a QUIC packet_sent event

6.4.  ProtocolType

   An event's "protocol_type" array field indicates to which protocols
   (or protocol "stacks") this event belongs.  This allows a single qlog
   file to aggregate traces of different protocols (e.g., a web server
   offering both TCP+HTTP/2 and QUIC+HTTP/3 connections).

   Definition:

   ProtocolType = [+ text]

                     Figure 20: ProtocolType definition

   For example, QUIC and HTTP/3 events have the "QUIC" and "HTTP3"
   protocol_type entry values, see [QLOG-QUIC] and [QLOG-H3].

   Typically however, all events in a single trace are of the same few
   protocols, and this array field is logged once in "common_fields",
   see Section 6.8.

6.5.  Triggers

   Sometimes, additional information is needed in the case where a
   single event can be caused by a variety of other events.  In the
   normal case, the context of the surrounding log messages gives a hint
   as to which of these other events was the cause.  However, in
   highly-parallel and optimized implementations, corresponding log
   messages might be separated in time.  Another option is to explicitly
   indicate these "triggers" in a high-level way per-event to get more
   fine-grained information without much additional overhead.

   In qlog, the optional "trigger" field contains a string value
   describing the reason (if any) for this event instance occurring, see
   Section 6.3.  While this "trigger" field could be a property of the
   qlog Event itself, it is instead a property of the "data" field.
   This choice was made because many event types do not include a
   trigger value, and having the field at the Event-level would cause
   overhead in some serializations.  Additional information on the
   trigger can be added in the form of additional member fields of the
   "data" field value, yet this is highly implementation-specific, as
   are the trigger field's string values.

   One purely illustrative example of some potential triggers for QUIC's
   "packet_dropped" event is shown in Figure 21:

   TransportPacketDropped = {
       ? packet_type: PacketType
       ? raw_length: uint16
       ? trigger: "key_unavailable" /
                  "unknown_connection_id" /
                  "decrypt_error" /
                  "unsupported_version"
   }

                         Figure 21: Trigger example

6.6.  Grouping

   As discussed in Section 3.2, a single qlog file can contain several
   traces taken from different vantage points.  However, a single trace
   from one endpoint can also contain events from a variety of sources.
   For example, a server implementation might choose to log events for
   all incoming connections in a single large (streamed) qlog file.  As
   such, a method for splitting up events belonging to separate logical
   entities is required.

   The simplest way to perform this splitting is by associating a "group
   id" with each event that indicates to which conceptual "group" the
   event belongs.  A post-processing step can then extract events per
   group.  However, this group identifier can be highly protocol and
   context-specific.  In the example above, the QUIC "Original
   Destination Connection ID" could be used to uniquely identify a
   connection.  As such, the server might add an "ODCID" field to each
   event.  However, a middlebox logging IP or TCP traffic might rather
   use four-tuples to identify connections, and add a "four_tuple"
   field.

   As such, to provide consistency and ease of tooling in cross-protocol
   and cross-context setups, qlog instead defines the common "group_id"
   field, which contains a string value.  Implementations are free to
   use their preferred string serialization for this field, so long as
   it contains a unique value per logical group.  Some examples can be
   seen in Figure 23.

   Definition:

   GroupID = text

                       Figure 22: GroupID definition

   JSON serialization example for events grouped by four tuples and QUIC
   connection IDs:

   "events": [
       {
           "time": 1553986553579,
           "protocol_type": ["TCP", "TLS", "HTTP2"],
           "group_id": "ip1=2001:67c:1232:144:9498:6df6:f450:110b,
                      ip2=2001:67c:2b0:1c1::198,port1=59105,port2=80",
           "name": "quic:packet_received",
           "data": { ... }
       },
       {
           "time": 1553986553581,
           "protocol_type": ["QUIC","HTTP3"],
           "group_id": "127ecc830d98f9d54a42c4f0842aa87e181a",
           "name": "quic:packet_sent",
           "data": { ... }
       }
   ]

                         Figure 23: GroupID example
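
   The post-processing step mentioned above could, as a non-normative
   sketch, look as follows in Python.  The function name split_by_group
   is hypothetical.  Falling back to a "group_id" logged in
   "common_fields" (see Section 6.8) and collecting events without any
   group under the key None are choices made for this example.

```python
from collections import defaultdict


def split_by_group(trace):
    """Return a dict mapping each group_id to the list of its events."""
    # a group_id logged once in common_fields applies to events that
    # do not carry their own group_id
    default_group = trace.get("common_fields", {}).get("group_id")
    groups = defaultdict(list)
    for event in trace.get("events", []):
        groups[event.get("group_id", default_group)].append(event)
    return dict(groups)
```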

   Note that in some contexts (for example a Multipath transport
   protocol) it might make sense to add additional contextual per-event
   fields (for example "path_id"), rather than use the group_id field
   for that purpose.

   Note also that, typically, a single trace only contains events
   belonging to a single logical group (for example, an individual QUIC
   connection).  As such, instead of logging the "group_id" field with
   an identical value for each event instance, this field is typically
   logged once in "common_fields", see Section 6.8.

6.7.  SystemInformation

   The "system_info" field can be used to record system-specific details
   related to an event.  This is useful, for instance, where an
   application splits work across CPUs, processes, or threads and events
   for a single trace occur on potentially different combinations
   thereof.  Each field is optional to support deployment diversity.

   Definition:

   SystemInformation = {
     ? processor_id: uint32
     ? process_id: uint32
     ? thread_id: uint32
   }

6.8.  CommonFields

   As discussed in the previous sections, information for a typical qlog
   event varies in three main fields: "time", "name" and associated
   data.  Additionally, there are also several more advanced fields that
   allow mixing events from different protocols and contexts inside of
   the same trace (for example "protocol_type" and "group_id").  In most
   "normal" use cases however, the values of these advanced fields are
   consistent for each event instance (for example, a single trace
   contains events for a single QUIC connection).

   To reduce file size and make logging easier, qlog uses the
   "common_fields" list to indicate those fields and their values that
   are shared by all events in this component trace.  This prevents
   these fields from being logged for each individual event.  An example
   of this is shown in Figure 24.

   JSON serialization with repeated field values
   per-event instance:

   {
       "events": [{
               "group_id": "127ecc830d98f9d54a42c4f0842aa87e181a",
               "protocol_type": ["QUIC","HTTP3"],
               "time_format": "relative",
               "reference_time": 1553986553572,

               "time": 2,
               "name": "quic:packet_received",
               "data": { ... }
           },{
               "group_id": "127ecc830d98f9d54a42c4f0842aa87e181a",
               "protocol_type": ["QUIC","HTTP3"],
               "time_format": "relative",
               "reference_time": 1553986553572,

               "time": 7,
               "name": "http:frame_parsed",
               "data": { ... }
           }
       ]
   }

   JSON serialization with repeated field values instead
   extracted to common_fields:

   {
       "common_fields": {
           "group_id": "127ecc830d98f9d54a42c4f0842aa87e181a",
           "protocol_type": ["QUIC","HTTP3"],
           "time_format": "relative",
           "reference_time": 1553986553572
       },
       "events": [
           {
               "time": 2,
               "name": "quic:packet_received",
               "data": { ... }
           },{
               "time": 7,
               "name": "http:frame_parsed",
               "data": { ... }
           }
       ]
   }

                      Figure 24: CommonFields example
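
   A tool that prefers to work with fully self-contained events can
   expand the second form of Figure 24 back into the first.  The Python
   sketch below is non-normative; the function name expand_common_fields
   is hypothetical, and letting a per-event value take precedence over
   common_fields is an assumption of this example (a compliant trace
   does not log a field with differing values in both places).

```python
def expand_common_fields(trace):
    """Return the events of a trace with common_fields merged into each."""
    common = trace.get("common_fields", {})
    # later keys win in a dict merge, so per-event values take precedence
    return [{**common, **event} for event in trace.get("events", [])]
```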

   An event's "common_fields" field is a generic dictionary of key-value
   pairs, where the key is always a string and the value can be of any
   type, but is typically also a string or number.  As such, unknown
   entries in this dictionary MUST be disregarded by the user and tools
   (i.e., the presence of an unknown field is explicitly NOT an error).

   The list of default qlog fields that are typically logged in
   common_fields (as opposed to as individual fields per event instance)
   is shown in the listing below:

   Definition:

   CommonFields = {
       ? time_format: TimeFormat
       ? reference_time: float64
       ? protocol_type: ProtocolType
       ? group_id: GroupID
       * text => any
   }

                     Figure 25: CommonFields definition

   Tools MUST be able to deal with these fields being defined either on
   each event individually or combined in common_fields.  Note that if
   at least one event in a trace has a different value for a given
   field, this field MUST NOT be added to common_fields but instead
   defined on each event individually.  Good examples of such fields are
   "time" and "data", which are divergent by nature.

7.  Raw packet and frame information

   While qlog is a high-level logging format, it also allows the
   inclusion of most raw wire image information, such as byte lengths
   and byte values.  This is useful when, for example, investigating or
   tuning packetization behavior or determining encoding/framing
   overheads.  However, these fields are not always necessary, can take
   up considerable space, and can have a considerable privacy and
   security impact (see Section 13).  Where applicable, these fields are
   grouped in a separate, optional, field named "raw" of type RawInfo.
   The exact definition of entities, headers, trailers and payloads
   depends on the protocol used.

   Definition:

   RawInfo = {

       ; the full byte length of the entity (e.g., packet or frame),
       ; including possible headers and trailers
       ? length: uint64

       ; the byte length of the entity's payload,
       ; excluding possible headers or trailers
       ? payload_length: uint64

       ; the (potentially truncated) contents of the full entity,
       ; including headers and possibly trailers
       ? data: hexstring
   }

                       Figure 26: RawInfo definition

   The RawInfo:data field can be truncated for privacy or security
   purposes, see Section 10.1.2.  In this case, the length and
   payload_length fields should still indicate the non-truncated lengths
   when used for debugging purposes.

   This document does not specify explicit header_length or
   trailer_length fields.  In protocols without trailers, header_length
   can be calculated by subtracting the payload_length from the length.
   In protocols with trailers (e.g., QUIC's AEAD tag), event definition
   documents SHOULD define how to support header_length calculation.
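
   As a non-normative sketch, the calculation could be written as
   follows in Python.  The function name header_length and the
   trailer_length parameter are assumptions of this example: this
   document does not define a trailer length field, so a caller would
   have to supply that value based on the relevant event definition
   document.

```python
def header_length(raw, trailer_length=0):
    """Derive the header length from a RawInfo-like dict.

    Returns None if either length field is missing.  `trailer_length`
    is hypothetical and defaults to 0 (protocol without trailers).
    """
    length = raw.get("length")
    payload_length = raw.get("payload_length")
    if length is None or payload_length is None:
        return None
    return length - payload_length - trailer_length
```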

8.  Common events and data classes

   There are some event types and data classes that are common across
   protocols, applications, and use cases.  This section specifies such
   common definitions.

8.1.  Generic events

   In typical logging setups, users utilize a discrete number of
   well-defined logging categories, levels or severities to log freeform
   (string) data.  This generic events category replicates this approach
   to allow implementations to fully replace their existing text-based
   logging with qlog.  This is done by providing events to log generic
   strings for the typical well-known logging levels (error, warning,
   info, debug, verbose).

   For the events defined below, the "category" is "generic" and their
   "type" is the name of the heading in lowercase (e.g., the "name" of
   the error event is "generic:error").

8.1.1.  error

   Importance: Core

   Used to log details of an internal error that might not get reflected
   on the wire.

   Definition:

   GenericError = {
       ? code: uint64
       ? message: text
   }

                     Figure 27: GenericError definition

8.1.2.  warning

   Importance: Base

   Used to log details of an internal warning that might not get
   reflected on the wire.

   Definition:

   GenericWarning = {
       ? code: uint64
       ? message: text
   }

                    Figure 28: GenericWarning definition

8.1.3.  info

   Importance: Extra

   Used mainly for implementations that want to use qlog as their one
   and only logging format but still want to support unstructured string
   messages.

   Definition:

   GenericInfo = {
       message: text
   }

                     Figure 29: GenericInfo definition

8.1.4.  debug

   Importance: Extra

   Used mainly for implementations that want to use qlog as their one
   and only logging format but still want to support unstructured string
   messages.

   Definition:

   GenericDebug = {
       message: text
   }

                     Figure 30: GenericDebug definition

8.1.5.  verbose

   Importance: Extra

   Used mainly for implementations that want to use qlog as their one
   and only logging format but still want to support unstructured string
   messages.

   Definition:

   GenericVerbose = {
       message: text
   }

                    Figure 31: GenericVerbose definition
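
   As a non-normative sketch, an implementation replacing its text-based
   logging could build the five generic events with a single helper, as
   shown below in Python.  The function name generic_event and its
   parameters are hypothetical.  The helper always sets "message", which
   is required for info, debug and verbose and optional for error and
   warning, and it only accepts "code" for the two event types that
   define it.

```python
GENERIC_TYPES = ("error", "warning", "info", "debug", "verbose")


def generic_event(time, level, message, code=None):
    """Build a qlog event dict for one of the generic logging levels."""
    if level not in GENERIC_TYPES:
        raise ValueError(f"unknown generic event type: {level!r}")
    data = {"message": message}
    if code is not None:
        # only GenericError and GenericWarning define a "code" field
        if level not in ("error", "warning"):
            raise ValueError(f"generic:{level} does not define 'code'")
        data["code"] = code
    return {"time": time, "name": f"generic:{level}", "data": data}
```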

8.2.  Simulation events

   When evaluating a protocol implementation, one typically sets up a
   series of interoperability or benchmarking tests, in which the test
   situations can change over time.  For example, the network bandwidth
   or latency can vary during the test, or the network can be fully
   disabled for a short time.  In these setups, it is useful to know
   when exactly these conditions are triggered, to allow for proper
   correlation with other events.

   For the events defined below, the "category" is "simulation" and
   their "type" is the name of the heading in lowercase (e.g., the
   "name" of the scenario event is "simulation:scenario").

8.2.1.  scenario

   Importance: Extra

   Used to specify which specific scenario is being tested at this
   particular instance.  This supports, for example, aggregation of
   several simulations into one trace (e.g., split by group_id).

   Definition:

   SimulationScenario = {
       ? name: text
       ? details: {* text => any }
   }

                  Figure 32: SimulationScenario definition

8.2.2.  marker

   Importance: Extra

   Used to indicate when specific emulation conditions are triggered at
   set times (e.g., 2% packet loss is introduced at 3 seconds, a NAT
   rebind is triggered at 10 seconds).

   Definition:

   SimulationMarker = {
       ? type: text
       ? message: text
   }

                   Figure 33: SimulationMarker definition

9.  Event definition guidelines

   This document defines the main schema for the qlog format together
   with some common events, which on their own do not provide much
   logging utility.  It is expected that logging is extended with
   specific, per-protocol event definitions that specify the name
   (category + type) and data needed for each individual event.
   Examples include the QUIC event definitions [QLOG-QUIC] and HTTP/3
   and QPACK event definitions [QLOG-H3].

   This section defines some basic annotations and concepts that SHOULD
   be used by event definition documents.  Doing so ensures a measure of
   consistency that makes it easier for qlog implementers to support a
   wide variety of protocols.

9.1.  Event design

   There are several ways of defining qlog events.  In practice, two
   main types of approach have been observed: a) those that map directly
   to concepts seen in the protocols (e.g., packet_sent) and b) those
   that act as aggregating events that combine data from several
   possible protocol behaviors or code paths into one (e.g.,
   parameters_set).  The latter are typically used as a means to reduce
   the number of unique event definitions, as reflecting each possible
   protocol event as a separate qlog entity would cause an explosion of
   event types.

   Additionally, logging duplicate data is typically prevented as much
   as possible.  For example, packet header values that remain
   consistent across many packets are split into separate events (for
   example spin_bit_updated or connection_id_updated for QUIC).

   Finally, when logging additional state change events, those state
   changes can often be directly inferred from data on the wire (for
   example flow control limit changes).  As such, if the implementation
   is bug-free and spec-compliant, logging additional events is
   typically avoided.  Exceptions have been made for common events that
   benefit from being easily identifiable or individually logged (for
   example packets_acked).

9.2.  Event importance indicators

   Depending on how events are designed, it may be that several events
   allow the logging of similar or overlapping data.  For example the
   separate QUIC connection_started event overlaps with the more generic
   connection_state_updated.  In these cases, it is not always clear
   which event should be logged or used, and which event should take
   precedence if e.g., both are present and provide conflicting
   information.

   To aid in this decision making, each event SHOULD have an "importance
   indicator" with one of three values, in decreasing order of
   importance and expected usage:

   *  Core

   *  Base

   *  Extra

   The "Core" events are the events that SHOULD be present in all qlog
   files for a given protocol.  These are typically tied to basic packet
   and frame parsing and creation, as well as listing basic internal
   metrics.  Tool implementers SHOULD expect and add support for these
   events, though SHOULD NOT expect all Core events to be present in
   each qlog trace.

   The "Base" events add additional debugging options and MAY be present
   in qlog files.  Most of these can be implicitly inferred from data in
   Core events (if those contain all their properties), but for many it
   is better to log the events explicitly as well, making it clearer how
   the implementation behaves.  These events are for example tied to
   passing data around in buffers, to how internal state machines
   change, and used to help show when decisions are actually made based
   on received data.  Tool implementers SHOULD at least add support for
   showing the contents of these events, if they do not handle them
   explicitly.

   The "Extra" events are considered mostly useful for low-level
   debugging of the implementation, rather than the protocol.  They
   allow more fine-grained tracking of internal behavior.  As such, they
   MAY be present in qlog files and tool implementers MAY add support
   for these, but they are not required to.
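
   One way a logger might use these indicators is to filter events by a
   configured maximum level.  The Python sketch below is non-normative:
   the names IMPORTANCE_RANK, EVENT_IMPORTANCE and should_log are
   hypothetical, the table only lists the events defined in Section 8 of
   this document, and treating unlisted events as "Extra" is an
   assumption of this example.

```python
# lower rank means more important
IMPORTANCE_RANK = {"Core": 0, "Base": 1, "Extra": 2}

# importance indicators of the events defined in Section 8
EVENT_IMPORTANCE = {
    "generic:error": "Core",
    "generic:warning": "Base",
    "generic:info": "Extra",
    "generic:debug": "Extra",
    "generic:verbose": "Extra",
    "simulation:scenario": "Extra",
    "simulation:marker": "Extra",
}


def should_log(name, max_level="Base"):
    """Return True if the event's importance is at or above max_level."""
    # assumption: events missing from the table are treated as Extra
    importance = EVENT_IMPORTANCE.get(name, "Extra")
    return IMPORTANCE_RANK[importance] <= IMPORTANCE_RANK[max_level]
```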

   Note that in some cases, implementers might not want to log for
   example data content details in the "Core" events due to performance
   or privacy considerations.  In this case, they SHOULD use (a subset
   of) relevant "Base" events instead to ensure usability of the qlog
   output.  As an example, implementations that do not log QUIC
   packet_received events and thus also not which (if any) ACK frames
   the packet contains, SHOULD log packets_acked events instead.

   Finally, for event types whose data (partially) overlap with other
   event types' definitions, where necessary the event definition
   document should include explicit guidance on which to use in specific
   situations.

9.3.  Custom fields

   Event definition documents are free to define new category and event
   types, top-level fields (e.g., a per-event field indicating its
   privacy properties or path_id in multipath protocols), as well as
   values for the "trigger" property within the "data" field, or other
   member fields of the "data" field, as they see fit.

   They however SHOULD NOT expect non-specialized tools to recognize or
   visualize this custom data.  However, tools SHOULD make an effort to
   visualize even unknown data if possible in the specific tool's
   context.  If they do not, they MUST ignore these unknown fields.

10.  Serializing qlog

   This document and other related qlog schema definitions are
   intentionally independent of serialization format.  This means that
   implementers themselves can choose how to represent and serialize
   qlog data practically on disk or on the wire.  Some examples of
   possible formats are JSON, CBOR, CSV, protobuf, flatbuffers, etc.

   All these formats make certain tradeoffs between flexibility and
   efficiency, with textual formats like JSON typically being more
   flexible but also less efficient than binary formats like protocol
   buffers.  The format choice will depend on the practical use case of
   the qlog user.  For example, for use in day to day debugging, a
   plaintext readable (yet relatively large) format like JSON is
   probably preferred.  However, for use in production, a more optimized
   yet restricted format can be better.  In this latter case, it will be
   more difficult to achieve interoperability between qlog
   implementations of various protocol stacks, as some custom or tweaked
   events from one might not be compatible with the format of the other.
   This will also reflect in tooling: not all tools will support all
   formats.

   This being said, the authors prefer JSON as the basis for storing
   qlog, as it retains full flexibility and maximum interoperability.
   Storage overhead can be managed well in practice by employing
   compression.  For this reason, this document details how to
   practically transform qlog schema definitions to [JSON], its subset
   [I-JSON], and its streamable derivative [JSON-Text-Sequences].
   Concrete options to bring down JSON size and processing overheads are
   discussed in Section 10.3.

   Because different deserializers/parsers are needed for different
   formats, the "qlog_format" field is used to indicate the chosen
   serialization approach.  This field is always a string, but
   can be made hierarchical by the use of the "." separator between
   entries.  For example, a value of "JSON.optimizationA" can indicate
   that a default JSON format is being used, but that a certain
   optimization of type A was applied to the file as well (see also
   Section 10.3).
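
   As a non-normative sketch, a tool could split the "qlog_format" value
   into its base format and any further entries as shown below in
   Python.  The function name parse_qlog_format is hypothetical; the
   default of "JSON" follows Section 10.1.

```python
def parse_qlog_format(value="JSON"):
    """Split a qlog_format string into (base_format, [further entries])."""
    base, *entries = value.split(".")
    return base, entries
```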

10.1.  qlog to JSON mapping

   When mapping qlog to normal JSON, the "qlog_format" field MUST have
   the value "JSON".  This is also the default qlog serialization and
   default value of this field.

   When using normal JSON serialization, the file extension/suffix
   SHOULD be ".qlog" and the Media Type (if any) SHOULD be
   "application/qlog+json" per [RFC6839].

   JSON files by definition ([RFC8259]) MUST utilize the UTF-8 encoding,
   both for the file itself and the string values.

   While not specifically required by the JSON specification, all qlog
   field names in a JSON serialization MUST be lowercase.

   In order to serialize CDDL-based qlog event and data structure
   definitions to JSON, the official CDDL-to-JSON mapping defined in
   Appendix E of [CDDL] SHOULD be employed.

10.1.1.  I-JSON

   For some use cases, it should be taken into account that not all
   popular JSON parsers support the full JSON format.  Especially for
   parsers integrated with the JavaScript programming language (e.g.,
   Web browsers, NodeJS), users are recommended to stick to a JSON
   subset dubbed [I-JSON] (or Internet-JSON).

   One of the key limitations of JavaScript and thus I-JSON is that it
   cannot represent full 64-bit integers in standard operating mode
   (i.e., without using BigInt extensions), instead being limited to the
   range of [-(2**53)+1, (2**53)-1].  In these circumstances, Appendix E
   of [CDDL] recommends defining new CDDL types for int64 and uint64
   that limit their values to this range.

   While this can be sensible and workable for most use cases, some
   protocols targeting qlog serialization (e.g., QUIC, HTTP/3), might
   require full uint64 variables in some (rare) circumstances.  In these
   situations, it should be allowed to also use the string-based
   representation of uint64 values alongside the numerical
   representation.  Concretely, the following definition of uint64
   should override the original and (web-based) tools should take into
   account that a uint64 field can be either a number or string.

   uint64 = text /
            uint .size 8

               Figure 34: Custom uint64 definition for I-JSON
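
   The following non-normative Python sketch shows one way to produce
   and consume such dual-representation values.  The function names are
   hypothetical, and two choices are assumptions of this example: values
   are written as strings only when they exceed (2**53)-1, and the
   string form is a base-10 integer.

```python
MAX_SAFE_INTEGER = 2**53 - 1  # upper bound of the I-JSON integer range
UINT64_MAX = 2**64 - 1


def encode_uint64(value):
    """Return a number if it fits the I-JSON range, else a string."""
    if not 0 <= value <= UINT64_MAX:
        raise ValueError(f"not a uint64: {value!r}")
    return value if value <= MAX_SAFE_INTEGER else str(value)


def decode_uint64(field):
    """Accept either the numerical or the string representation."""
    # assumption: the string form is a decimal integer
    value = int(field) if isinstance(field, str) else field
    if not isinstance(value, int) or not 0 <= value <= UINT64_MAX:
        raise ValueError(f"not a uint64: {field!r}")
    return value
```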

10.1.2.  Truncated values

   For some use cases (e.g., limiting file size, privacy), it can be
   necessary not to log a full raw blob (using the hexstring type) but
   instead a truncated value (for example, only the first 100 bytes of
   an HTTP response body to be able to discern which file it actually
   contained).  In these cases, the original byte-size length cannot be
   obtained from the serialized value directly.

   As such, all qlog schema definitions SHOULD include a separate,
   length-indicating field for all fields of type hexstring they
   specify, see for example Section 7.  This not only ensures the
   original length can always be retrieved, but also allows the omission
   of any raw value bytes of the field completely (e.g., out of privacy
   or security considerations).

   To reduce overhead however and in the case the full raw value is
   logged, the extra length-indicating field can be left out.  As such,
   tools MUST be able to deal with this situation and derive the length
   of the field from the raw value if no separate length-indicating
   field is present.  The main possible permutations are shown by
   example in Figure 35.

   // both the full raw value and its length are present
   // (length is redundant)
   {
       "raw_length": 5,
       "raw": "051428abff"
   }

   // only the raw value is present, indicating it
   // represents the field's full value; the byte
   // length is obtained by calculating raw.length / 2
   {
       "raw": "051428abff"
   }

   // only the length field is present, meaning the
   // value was omitted
   {
       "raw_length": 5
   }

   // both fields are present and the lengths do not match:
   // the value was truncated to the first three bytes.
   {
       "raw_length": 5,
       "raw": "051428"
   }

          Figure 35: Example for serializing truncated hexstrings
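
   The permutations in Figure 35 can be distinguished mechanically.  The
   following Python sketch is illustrative only: the function name
   summarize_raw, the returned dictionary layout and the state labels
   are assumptions made for this example.  Only the "raw" and
   "raw_length" field names are taken from the figure.

```python
from typing import Optional, TypedDict


class RawSummary(TypedDict):
    original_length: Optional[int]  # byte length before truncation, if known
    logged_length: int              # number of bytes present in "raw"
    state: str                      # hypothetical label, see below


def summarize_raw(entry: dict) -> RawSummary:
    """Classify an object carrying optional "raw" and "raw_length" fields."""
    raw = entry.get("raw")
    raw_length = entry.get("raw_length")

    # Two hex characters encode one byte.
    logged = len(raw) // 2 if raw is not None else 0

    if raw is None and raw_length is None:
        return {"original_length": None, "logged_length": 0, "state": "unknown"}
    if raw is None:
        # Only the length is present: the value itself was omitted.
        return {"original_length": raw_length, "logged_length": 0,
                "state": "omitted"}
    if raw_length is None:
        # No separate length field: derive the length from the raw value.
        return {"original_length": logged, "logged_length": logged,
                "state": "full"}
    if logged == raw_length:
        state = "full"
    elif logged < raw_length:
        state = "truncated"
    else:
        # More bytes logged than announced: the log is inconsistent.
        state = "inconsistent"
    return {"original_length": raw_length, "logged_length": logged,
            "state": state}
```

   For the last object in Figure 35, this sketch reports an original
   length of 5 bytes, 3 logged bytes and the state "truncated".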

10.2.  qlog to JSON Text Sequences mapping

   JSON Text Sequences are very similar to JSON, except that JSON
   objects are serialized as individual records, each prefixed by an
   ASCII Record Separator (<RS>, 0x1E), and each ending with an ASCII
   Line Feed character (\n, 0x0A).  Note that each record can also
   contain any amount of newlines in its body, as long as it ends with a
   newline character before the next <RS> character.

   Each qlog event is serialized and interpreted as an individual JSON
   Text Sequence record, and can simply be appended as a new object at
   the back of an event stream or log file.  Put differently, unlike
   default JSON, it does not require a file to be wrapped as a full
   object with "{ ... }" or "[... ]".

   For this to work, however, some qlog definitions have to be adjusted.
   Mainly, events are no longer part of the "events" array in the Trace
   object, but are instead logged separately from the qlog "header", as
   indicated by the TraceSeq object in Figure 10.  Additionally, qlog's
   JSON-SEQ mapping does not allow logging multiple individual traces in
   a single qlog file.  As such, the QlogFile:traces field is replaced
   by the singular QlogFileSeq:trace field, see Figure 8.  An example
   can be seen in Figure 9.  Note that the "group_id" field can still be
   used on a per-event basis to include events from conceptually
   different sources in a single JSON-SEQ qlog file.

   When using JSON-SEQ serialization, the file extension/suffix SHOULD
   be ".sqlog" (for "streaming" qlog) and the Media Type (if any) SHOULD
   be "application/qlog+json-seq" per [RFC8091].

   While not specifically required by the JSON-SEQ specification, all
   qlog field names in a JSON-SEQ serialization MUST be lowercase.

   In order to serialize all other CDDL-based qlog event and data
   structure definitions to JSON-SEQ, the official CDDL-to-JSON mapping
   defined in Appendix E of [CDDL] SHOULD still be employed.

10.2.1.  Supporting JSON Text Sequences in tooling

   Note that JSON Text Sequences are not supported in most default
   programming environments (unlike normal JSON).  However, several
   custom JSON-SEQ parsing libraries exist in most programming languages
   that can be used and the format is easy enough to parse with existing
   implementations (i.e., by splitting the file into its component
   records and feeding them to a normal JSON parser individually, as
   each record by itself is a valid JSON object).
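
   The splitting approach described above can be sketched in a few lines
   of Python.  This is a minimal illustration, not a complete JSON Text
   Sequences parser; the function name parse_json_seq is an assumption
   made for this example.

```python
import json
from typing import Any, Iterator

RS = "\x1e"  # ASCII Record Separator that prefixes each record


def parse_json_seq(text: str) -> Iterator[Any]:
    """Yield one parsed JSON value per JSON Text Sequence record."""
    for chunk in text.split(RS):
        record = chunk.strip()
        if not record:
            # Text before the first <RS>, or an empty record.
            continue
        # Newlines inside a record are plain JSON whitespace.
        yield json.loads(record)
```

   A record that was cut short (for example, because the logging process
   stopped mid-write) makes json.loads raise json.JSONDecodeError in
   this sketch; a tool would likely want to catch that error and skip
   the damaged record.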

10.3.  Other optimized formatting options

   Both the JSON and JSON-SEQ formatting options described above are
   serviceable in general small to medium scale (debugging) setups.
   However, these approaches tend to be relatively verbose, leading to
   larger file sizes.  Additionally, generalized JSON(-SEQ)
   (de)serialization performance is typically (slightly) lower than that
   of more optimized and predictable formats.  Both aspects make these
   formats more challenging (though still practical
   (https://qlog.edm.uhasselt.be/anrw/)) to use in large scale setups.

   During the development of qlog, a multitude of alternative formatting
   and optimization options were compared.  The results of this study
   are summarized on the qlog github repository
   (https://github.com/quiclog/internet-drafts/issues/30#issuecomment-
   617675097).  The rest of this section discusses some of these
   approaches implementations could choose and the expected gains and
   tradeoffs inherent therein.  Tools SHOULD support mainly the
   compression options listed in Section 10.3.2, as they provide the
   largest wins for the least cost overall.

   Over time, specific qlog formats and encodings can be created that
   more formally define and combine some of the discussed optimizations
   or add new ones.  It was decided to define these schemes in separate
   documents to keep the main qlog definition clean and generalizable,
   as not all contexts require the same performance or flexibility as
   others and qlog is intended to be a broadly usable and extensible
   format (for example more flexibility is needed in earlier stages of
   protocol development, while more performance is typically needed in
   later stages).  This is also the main reason why the general qlog
   format is the less optimized JSON instead of a more performant
   option.

   To be able to easily distinguish between these options in qlog
   compatible tooling (without the need to have the user provide out-of-
   band information or to (heuristically) parse and process files in a
   multitude of ways, see also Section 12), it is recommended that
   explicit file extensions are used to indicate specific formats.  As
   there are no standards in place for this type of extension to format
   mapping, a commonly used scheme is proposed: list the applied
   optimizations in the extension in ascending order of application
   (e.g., if a qlog file is first optimized with technique A and then
   compressed with technique B, the resulting file would have the
   extension ".(s)qlog.A.B").  This allows tooling to start at the back
   of the extension to "undo" applied optimizations to finally arrive at
   the expected qlog representation.
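
   As a rough illustration of this naming scheme, the Python sketch
   below splits a file name into its base qlog extension and the list of
   optimizations to undo, starting at the back.  The function name
   split_qlog_extension is hypothetical, and the sketch assumes that
   each optimization is named by a single dot-separated label.

```python
import os
from typing import List, Tuple

BASE_EXTENSIONS = ("qlog", "sqlog")


def split_qlog_extension(filename: str) -> Tuple[str, List[str]]:
    """Return (base extension, optimizations to undo, last-applied first)."""
    parts = os.path.basename(filename).split(".")
    # Search from the back; index 0 is the file stem, never an extension.
    for i in range(len(parts) - 1, 0, -1):
        if parts[i] in BASE_EXTENSIONS:
            return parts[i], list(reversed(parts[i + 1:]))
    raise ValueError(f"not a qlog file name: {filename!r}")
```

   For a file named "trace.sqlog.cbor.gz", the sketch returns the base
   extension "sqlog" and the steps "gz" and then "cbor", which is the
   order in which a tool would undo them.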

10.3.1.  Data structure optimizations

   The first general category of optimizations is to alter the
   representation of data within a JSON(-SEQ) qlog file to reduce file
   size.

   The first option is to employ a scheme similar to the CSV (comma
   separated value [RFC4180]) format, which utilizes the concept of
   column "headers" to prevent repeating field names for each datapoint
   instance.  Concretely for JSON qlog, several field names are repeated
   with each event (i.e., time, name, data).  These names could be
   extracted into a separate list, after which qlog events could be
   serialized as an array of values, as opposed to a full object.  This
   approach was a key part of the original qlog format (prior to draft-
   02) using the "event_fields" field.  However, tests showed that this
   optimization only provided a mean file size reduction of 5% (100MB to
   95MB) while significantly increasing the implementation complexity,
   and this approach was abandoned in favor of the default JSON setup.
   Implementations using this format should not employ a separate file
   extension (as it still uses JSON), but rather employ a new value of
   "JSON.namedheaders" (or "JSON-SEQ.namedheaders") for the
   "qlog_format" field (see Section 3).

   The second option is to replace field values and/or names with
   indices into a (dynamic) lookup table.  This is a common compression
   technique and can provide significant file size reductions (up to 50%
   in tests, 100MB to 50MB).  However, this approach is even more
   difficult to implement efficiently and requires either including the
   (dynamic) table in the resulting file (an approach taken by for
   example Chromium's NetLog format
   (https://www.chromium.org/developers/design-documents/network-stack/
   netlog)) or defining a (static) table up-front and sharing this
   between implementations.  Implementations using this approach should
   not employ a separate file extension (as it still uses JSON), but
   rather employ a new value of "JSON.dictionary" (or "JSON-
   SEQ.dictionary") for the "qlog_format" field (see Section 3).

   As both options variously proved difficult to implement, reduced qlog
   file readability, or provided too little improvement compared to
   other more straightforward options (for example Section 10.3.2),
   these schemes are not inherently part of qlog.
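
   To make the first option (named headers) more concrete, the Python
   sketch below extracts the repeated field names into a single list and
   serializes each event as an array of values.  The layout of the
   packed object and both function names are assumptions made for this
   example; they are only loosely modelled on the "event_fields" idea
   and do not reproduce the exact pre-draft-02 format.

```python
from typing import Any, Dict, List, Sequence

# Field names repeated in every event, as listed in this section.
EVENT_FIELDS = ("time", "name", "data")


def to_named_headers(events: List[Dict[str, Any]],
                     fields: Sequence[str] = EVENT_FIELDS) -> Dict[str, Any]:
    """Pack event objects into a header list plus one value array per event."""
    return {
        "event_fields": list(fields),
        "events": [[event.get(field) for field in fields] for event in events],
    }


def from_named_headers(packed: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Rebuild event objects from the packed representation."""
    fields = packed["event_fields"]
    return [dict(zip(fields, row)) for row in packed["events"]]
```

   Note that this sketch silently drops any event field that is not in
   the header list and stores a null for any listed field that an event
   lacks, which hints at the extra implementation complexity mentioned
   above.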

10.3.2.  Compression

   The second general category of optimizations is to utilize a
   (generic) compression scheme for textual data.  As qlog in the JSON(-
   SEQ) format typically contains a large amount of repetition, off-the-
   shelf (text) compression techniques typically succeed very well in
   bringing down file sizes (regularly by up to two orders of
   magnitude in tests, even for "fast" compression levels).  As such,
   utilizing compression is recommended before attempting other
   optimization options, even though this might (somewhat) increase
   processing costs due to the additional compression step.

   The first option is to use GZIP compression ([RFC1952]).  This
   generic compression scheme provides multiple compression levels
   (providing a trade-off between compression speed and size reduction).
   Utilized at level 6 (a medium setting thought to be applicable for
   streaming compression of a qlog stream in commodity devices), gzip
   compresses qlog JSON files to 7% of their initial size on average
   (100MB to 7MB).  For this option, the file extension .(s)qlog.gz
   SHOULD be used.  The "qlog_format" field should still reflect the
   original JSON formatting of the qlog data (e.g., "JSON" or "JSON-
   SEQ").

   The second option is to use Brotli compression ([RFC7932]).  While
   similar to gzip, this more recent compression scheme provides a
   better efficiency.  It also allows multiple compression levels.
   Utilized at level 4 (a medium setting thought to be applicable for
   streaming compression of a qlog stream in commodity devices), brotli
   compresses qlog JSON files to 7% of their initial size on average
   (100MB to 7MB).  For this option, the file extension .(s)qlog.br
   SHOULD be used.  The "qlog_format" field should still reflect the
   original JSON formatting of the qlog data (e.g., "JSON" or "JSON-
   SEQ").

   Other compression algorithms of course exist (for example xz, zstd,
   and lz4).  The gzip and brotli schemes are recommended because of
   their tweakable behaviour and wide support in web-based environments,
   which are envisioned as the main tooling ecosystem (see also
   Section 12).
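
   As a minimal sketch of the gzip option, the Python standard library
   can compress and restore a serialized qlog as shown below.  The
   function names are assumptions made for this example.  Brotli is not
   part of the Python standard library and is therefore not shown.

```python
import gzip


def gzip_qlog(text: str, level: int = 6) -> bytes:
    """Compress serialized qlog text; level 6 is the medium setting above."""
    return gzip.compress(text.encode("utf-8"), compresslevel=level)


def gunzip_qlog(blob: bytes) -> str:
    """Restore the original serialized qlog text."""
    return gzip.decompress(blob).decode("utf-8")
```

   The bytes returned by gzip_qlog would be written to a file carrying
   the .(s)qlog.gz extension, while the "qlog_format" field inside the
   data remains unchanged.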

10.3.3.  Binary formats

   The third general category of optimizations is to use a more
   optimized (often binary) format instead of the textual JSON format.
   This approach inherently produces smaller files and often has better
   (de)serialization performance.  However, the resultant files are no
   longer human readable and some formats require hard tradeoffs between
   flexibility and performance.

   The first option is to use the CBOR (Concise Binary Object
   Representation [RFC7049]) format.  For the purposes of qlog, CBOR can
   be viewed as a straightforward binary variant of JSON.  As such,
   existing JSON qlog files can be trivially converted to and from CBOR
   (though slightly more work is needed for JSON-SEQ qlogs to convert
   them to CBOR-SEQ, see [RFC8742]).  While CBOR thus does retain the
   full qlog flexibility, it only provides a 25% file size reduction
   (100MB to 75MB) compared to textual JSON(-SEQ).  As CBOR support in
   programming environments is not as widespread as that of textual JSON
   and the format lacks human readability, CBOR was not chosen as the
   default qlog format.  For this option, the file extension
   .(s)qlog.cbor SHOULD be used.  The "qlog_format" field should still
   reflect the original JSON formatting of the qlog data (e.g., "JSON"
   or "JSON-SEQ").  The media type should indicate both whether JSON or
   JSON Text Sequences are used, as well as whether CBOR or CBOR
   Sequences are used (see the table below).

   A second option is to use a more specialized binary format, such as
   Protocol Buffers (https://developers.google.com/protocol-buffers)
   (protobuf).  This format is battle-tested, has support for optional
   fields and has libraries in most programming languages.  Still, it is
   significantly less flexible than textual JSON or CBOR, as it relies
   on a separate, pre-defined schema (a .proto file).  As such, it is
   not possible to (easily) log new event types in protobuf files
   without adjusting this schema as well, which has its own practical
   challenges.  As qlog is intended to be a flexible, general purpose
   format, this type of format was not chosen as its basic
   serialization.  The lower flexibility does lead to significantly
   reduced file sizes.  A straightforward mapping of the qlog main
   schema and QUIC/HTTP3 event types to protobuf created qlog files 24%
   as large as the raw JSON equivalents (100MB to 24MB).  For this
   option, the file extension .(s)qlog.protobuf SHOULD be used.  The
   "qlog_format" field should reflect the different internal format, for
   example: "qlog_format": "protobuf".

   Note that binary formats can (and should) also be used in conjunction
   with compression (see Section 10.3.2).  For example, CBOR compresses
   well (to about 6% of the original textual JSON size (100MB to 6MB)
   for both gzip and brotli) and so does protobuf (5% (gzip) to 3%
   (brotli)).  However, these gains are similar to the ones achieved by
   simply compressing the textual JSON equivalents directly (7%, see
   Section 10.3.2).  As such, since compression is still needed to
   achieve optimal file size reductions even with binary formats, the
   more flexible compressed textual JSON options are likely a better
   default for the qlog format in general.

10.3.4.  Overview and summary

   In summary, textual JSON was chosen as the main qlog format due to
   its high flexibility and because its inefficiencies can be largely
   solved by the utilization of compression techniques (which are needed
   to achieve optimal results with other formats as well).

   Still, qlog implementers are free to define other qlog formats
   depending on their needs and context of use.  These formats should be
   described in their own documents, the discussion in this document
   mainly acting as inspiration and high-level guidance.  Implementers
   are encouraged to add concrete qlog formats and definitions to the
   designated public repository (https://github.com/quiclog/qlog).

   The following table provides an overview of all the discussed qlog
   formatting options with examples:

   +===============+===================+================+==============+
   | format        | qlog_format       | extension      | media type   |
   +===============+===================+================+==============+
   | JSON          | JSON              | .qlog          | application/ |
   | Section 10.1  |                   |                | qlog+json    |
   +---------------+-------------------+----------------+--------------+
   | JSON Text     | JSON-SEQ          | .sqlog         | application/ |
   | Sequences     |                   |                | qlog+json-   |
   | Section 10.2  |                   |                | seq          |
   +---------------+-------------------+----------------+--------------+
   | named         | JSON(-            | .(s)qlog       | application/ |
   | headers       | SEQ).namedheaders |                | qlog+json(-  |
   | Section       |                   |                | seq)         |
   | 10.3.1        |                   |                |              |
   +---------------+-------------------+----------------+--------------+
   | dictionary    | JSON(-            | .(s)qlog       | application/ |
   | Section       | SEQ).dictionary   |                | qlog+json(-  |
   | 10.3.1        |                   |                | seq)         |
   +---------------+-------------------+----------------+--------------+
   | CBOR Section  | JSON(-SEQ)        | .(s)qlog.cbor  | application/ |
   | 10.3.3        |                   |                | qlog+json(-  |
   |               |                   |                | seq)+cbor(-  |
   |               |                   |                | seq)         |
   +---------------+-------------------+----------------+--------------+
   | protobuf      | protobuf          | .qlog.protobuf | NOT          |
   | Section       |                   |                | SPECIFIED BY |
   | 10.3.3        |                   |                | IANA         |
   +---------------+-------------------+----------------+--------------+
   +---------------+-------------------+----------------+--------------+
   | gzip Section  | no change         | .gz suffix     | application/ |
   | 10.3.2        |                   |                | gzip         |
   +---------------+-------------------+----------------+--------------+
   | brotli        | no change         | .br suffix     | NOT          |
   | Section       |                   |                | SPECIFIED BY |
   | 10.3.2        |                   |                | IANA         |
   +---------------+-------------------+----------------+--------------+

                                  Table 1

10.4.  Conversion between formats

   As discussed in the previous sections, a qlog file can be serialized
   in a multitude of formats, each of which can conceivably be
   transformed into or from one another without loss of information.
   For example, a number of JSON-SEQ streamed qlogs could be combined
   into a JSON formatted qlog for later processing.  Similarly, a
   captured binary qlog could be transformed to JSON for easier
   interpretation and sharing.

   Secondly, other structured logging approaches contain similar (though
   typically not identical) data to qlog, like raw packet capture files
   (for example .pcap files from tcpdump) or endpoint-specific logging
   formats (for example the NetLog format in Google Chrome).  These are
   sometimes the only options, if an implementation cannot or will not
   support direct qlog output for any reason, but does provide other
   internal or external (e.g., SSLKEYLOGFILE export to allow decryption
   of packet captures) logging options.  For this second category, a
   (partial) transformation from/to qlog can also be defined.

   As such, when defining a new qlog serialization format or wanting to
   utilize qlog-compatible tools with existing codebases lacking qlog
   support, it is recommended to define and provide a concrete mapping
   from one format to default JSON-serialized qlog.  Several such
   mappings exist.  Firstly, pcap2qlog
   (https://github.com/quiclog/pcap2qlog) transforms QUIC and HTTP/3
   packet capture files to qlog.
   Secondly, netlog2qlog
   (https://github.com/quiclog/qvis/tree/master/visualizations/src/
   components/filemanager/netlogconverter) converts Chromium's internal
   dictionary-encoded JSON format to qlog.  Finally, quictrace2qlog
   (https://github.com/quiclog/quictrace2qlog) converts the older
   quictrace format to JSON qlog.  Tools can then easily integrate with
   these converters (either by incorporating them directly or for
   example using them as a (web-based) API) so users can provide
   different file types with ease.  For example, the qvis
   (https://qvis.edm.uhasselt.be) toolsuite supports a multitude of
   formats and qlog serializations.
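
   As an illustration of the first kind of conversion discussed in this
   section, the Python sketch below combines several JSON-SEQ qlogs into
   a single JSON qlog object.  It assumes that the first record of each
   input is the qlog "header" carrying a "trace" field and that all
   later records are events.  The function name combine_seq_logs and the
   choice to reuse the file-level fields of the first header are
   assumptions made for this example.

```python
import json
from typing import Any, Dict, List

RS = "\x1e"  # ASCII Record Separator that prefixes each record


def combine_seq_logs(seq_texts: List[str]) -> Dict[str, Any]:
    """Merge JSON-SEQ qlog texts into one JSON qlog with several traces."""
    combined: Dict[str, Any] = {}
    traces: List[Dict[str, Any]] = []
    for text in seq_texts:
        records = [json.loads(chunk) for chunk in text.split(RS)
                   if chunk.strip()]
        if not records:
            continue  # nothing was logged in this input
        header, events = records[0], records[1:]
        if not combined:
            # Assumption: file-level fields of the first header are reused.
            combined = {k: v for k, v in header.items() if k != "trace"}
        # Events move back into the "events" array of their trace.
        trace = dict(header.get("trace", {}))
        trace["events"] = events
        traces.append(trace)
    combined["qlog_format"] = "JSON"
    combined["traces"] = traces
    return combined
```

   The result can be written out with json.dumps to obtain a file in the
   default JSON format, with one entry in "traces" per input.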

11.  Methods of access and generation

   Different implementations will have different ways of generating and
   storing qlogs.  However, there is still value in defining a few
   default ways in which to steer this generation and access of the
   results.

11.1.  Set file output destination via an environment variable

   To provide users control over where and how qlog files are created,
   two environment variables are defined.  The first, QLOGFILE,
   indicates a full path to where an individual qlog file should be
   stored.  This path MUST include the full file extension.  The second,
   QLOGDIR, sets a general directory path in which qlog files should be
   placed.  This path MUST include the directory separator character at
   the end.

   In general, QLOGDIR should be preferred over QLOGFILE if an endpoint
   is prone to generate multiple qlog files.  This can for example be
   the case for a QUIC server implementation that logs each QUIC
   connection in a separate qlog file.  An alternative that uses
   QLOGFILE would be a QUIC server that logs all connections in a single
   file and uses the "group_id" field (Section 6.6) to allow post-hoc
   separation of events.

   Implementations SHOULD provide support for QLOGDIR and MAY provide
   support for QLOGFILE.

   When using QLOGDIR, it is up to the implementation to choose an
   appropriate naming scheme for the qlog files themselves.  The chosen
   scheme will typically depend on the context or protocols used.  For
   example, for QUIC, it is recommended to use the Original Destination
   Connection ID (ODCID), followed by the vantage point type of the
   logging endpoint.  Examples of all options for QUIC are shown in
   Figure 36.

   Command: QLOGFILE=/srv/qlogs/client.qlog quicclientbinary

   Should result in the quicclientbinary executable logging a
   single qlog file named client.qlog in the /srv/qlogs directory.
   This is for example useful in tests when the client sets up
   just a single connection and then exits.

   Command: QLOGDIR=/srv/qlogs/ quicserverbinary

   Should result in the quicserverbinary executable generating
   several log files, one for each QUIC connection.
   Given two QUIC connections, with ODCID values "abcde" and
   "12345" respectively, this would result in two files:
   /srv/qlogs/abcde_server.qlog
   /srv/qlogs/12345_server.qlog

   Command: QLOGFILE=/srv/qlogs/server.qlog quicserverbinary

   Should result in the quicserverbinary executable logging
   a single qlog file named server.qlog in the /srv/qlogs directory.
   Given that the server handled two QUIC connections before it was
   shut down, with ODCID values "abcde" and "12345" respectively,
   this would result in event instances in the qlog file being
   tagged with the "group_id" field with values "abcde" and "12345".

     Figure 36: Environment variable examples for a QUIC implementation
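
   The behaviour shown in Figure 36 could be implemented along the lines
   of the following Python sketch.  The function name qlog_output_path
   is hypothetical, and giving QLOGDIR precedence when both variables
   are set is an assumption of this example rather than a requirement
   of this document.

```python
import os
from typing import Mapping, Optional


def qlog_output_path(odcid: str, vantage_point: str,
                     env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Pick the qlog output path for one connection, or None if unset."""
    qlog_dir = env.get("QLOGDIR")
    if qlog_dir:
        # One file per connection: <ODCID>_<vantage point type>.qlog
        return os.path.join(qlog_dir, f"{odcid}_{vantage_point}.qlog")
    qlog_file = env.get("QLOGFILE")
    if qlog_file:
        # A single file shared by all connections of this process.
        return qlog_file
    return None  # neither variable is set: no file output requested
```

   When only QLOGFILE is set, every connection maps to the same path, so
   an implementation would tag events with the "group_id" field as in
   the last example of Figure 36.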

12.  Tooling requirements

   Tools ingesting qlog MUST indicate which qlog version(s), qlog
   format(s), compression methods and potentially other input file
   formats (for example .pcap) they support.  Tools SHOULD at least
   support .qlog files in the default JSON format (Section 10.1).
   Additionally, they SHOULD indicate exactly which values for and
   properties of the name (category and type) and data fields they look
   for to execute their logic.  Tools SHOULD perform a (high-level)
   check if an input qlog file adheres to the expected qlog schema.  If
   a tool determines a qlog file does not contain enough supported
   information to correctly execute the tool's logic, it SHOULD generate
   a clear error message to this effect.

   Tools MUST NOT produce breaking errors for any field names and/or
   values in the qlog format that they do not recognize.  Tools SHOULD
   indicate even unknown event occurrences within their context (e.g.,
   marking unknown events on a timeline for manual interpretation by the
   user).

   Tool authors should be aware that, depending on the logging
   implementation, some events will not always be present in all traces.
   For example, using a circular logging buffer of a fixed size, it
   could be that the earliest events (e.g., connection setup events) are
   later overwritten by "newer" events.  Alternatively, some events can
   be intentionally omitted out of privacy or file size considerations.
   Tool authors are encouraged to make their tools robust enough to
   still provide adequate output for incomplete logs.
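
   One way for a tool to tolerate unknown or missing events is sketched
   below in Python.  The function name process_events, the handler table
   and the event names used with it are hypothetical; the sketch only
   illustrates collecting unrecognized events instead of raising an
   error.

```python
from typing import Callable, Dict, Iterable, List

Handler = Callable[[dict], None]


def process_events(events: Iterable[dict],
                   handlers: Dict[str, Handler]) -> List[dict]:
    """Run known handlers and return the events the tool did not recognize."""
    unknown: List[dict] = []
    for event in events:
        handler = handlers.get(event.get("name"))
        if handler is None:
            # Unknown or unnamed event: keep it for display, do not fail.
            unknown.append(event)
            continue
        handler(event)
    return unknown
```

   The returned list can then be surfaced to the user, for example as
   generic markers on a timeline, while the rest of the tool's logic
   continues to run on the events it does understand.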

13.  Security and privacy considerations

   Protocols such as TLS [RFC8446] and QUIC [RFC9000] provide varying
   degrees of secure protection for the wire image [RFC8546].  There is
   inevitably tension between security and observability when logging
   can reveal aspects of the wire image that would ordinarily be
   protected.  This tension equally applies to any privacy
   considerations that build on security properties, especially if data
   can be correlated across data sources.

   qlog operators and implementers should be mindful of the security and
   privacy risks inherent in handling qlog data.  This includes but is
   not limited to logging, storing, or using the data.  Data might be
   considered as non-sensitive, potentially-sensitive, or sensitive;
   applying the considerations in this section may produce different
   risks depending on the nature of the data itself, or its handling.
   However, in many cases the largest risk factors arise from data that
   can be considered as potentially-sensitive or sensitive.

   The following is a non-exhaustive list of such fields and types of
   data that can be carried in qlog data:

   *  IP addresses and transport protocol port numbers, which can be
      used to uniquely identify individual connections, endpoints, and
      potentially users.

   *  Session, Connection, or User identifiers which can be used to
      correlate nominally separate contexts.  For example, QUIC
      Connection IDs can be used to identify and track users across
      geographical networks (Section 9.5 of [RFC9000]).

   *  System-level information such as CPU, process, or thread
      identifiers.

   *  Stored State which can be used to correlate individual connections
      or sessions over time.  Examples include QUIC address validation
      and retry tokens, TLS session tickets, and HTTP cookies.

   *  Decryption keys, passwords, and tokens which can be used with
      other data sources (e.g., captures of encrypted packets) to
      correlate qlog data to a specific connection or user or leak
      additional information.  Examples include TLS decryption keys and
      HTTP-level API access or authorization tokens.

   *  Data that can be used to correlate qlogs to other data sources
      (e.g., captures of encrypted packets).  Examples include high-
      resolution event timestamps or inter-event timings, event counts,
      packet and frame sizes.

   *  Full or partial encrypted raw packet and frame payloads, which can
      be used with other data sources (e.g., captures of encrypted
      packets) to correlate qlog data to a specific connection or
      session.

   *  Full or partial plaintext raw packet and frame payloads (e.g.,
      HTTP Field values, HTTP response data, TLS SNI field values),
      which can contain directly sensitive information.

   The simplest and most extreme form of protection against abuse of
   this information is the complete deletion of a given field, which is
   equivalent to not logging the field(s) in question.  While deletion
   completely protects the data in the deleted fields from the risk of
   compromise, it also reduces the utility of the dataset as a whole.
   As such, a balance should be found between logging these fields and
   the potential risks inherent in their (involuntary) disclosure.  This
   balance depends on the use case at hand (e.g., research datasets
   might have different requirements to live operational
   troubleshooting).  Capturing the minimal amount of data required for
   a specific purpose can help to minimize the risks associated with
   data usage. qlog implementations that provide fine-grained control
   over the inclusion of data fields, ideally on a per-use-case or per-
   connection basis, improve the ability to minimize data.

   Any data that is determined to be necessary for a use case at hand
   could be logged or captured.  As per [RFC6973], operators must be
   aware that such data will be at risk of compromise.  As such,
   measures should be taken to firstly reduce the risk of compromise and
   secondly reduce the risk of abuse of compromised data.  While a full
   discussion of both aspects is out of scope for this document, the
   following paragraphs discuss high-level considerations that can be
   applied to qlog data.

   To reduce the risk of compromise, operators can take measures such
   as: limiting the length of time that data is stored, encrypting data
   in transit and at rest, limiting access rights to the data, and
   auditing data usage practices. qlog deployments that provide
   integrated options for automated or manual data deletion and
   (aggressive) aggregation improve the ability to minimize the risk of
   compromise.

   To reduce the risk of data abuse after compromise, data can be
   anonymized, pseudonymized, otherwise permutated/replaced, truncated,
   (re-)encrypted, or aggregated.  A partial discussion of applicable
   techniques (especially for IP address information) can be found in
   Appendix B of [DNS-PRIVACY].  Operators should, however, be aware
   that many of these techniques have been shown to be insufficient to
   safeguard user privacy and/or to protect user identity, especially if
   a qlog data set is large or easily correlated against other data
   sources.

   Finally, qlog operators should consider the interplay between their
   use case needs and end user rights or preferences.  While active user
   participation (as indicated by [RFC6973]) on a per-qlog basis is
   difficult, as logs are often captured out-of-band to the main user
   interaction and intent, general user expectations should be taken
   into account. qlog deployments that provide mechanisms to integrate
   the capture, storage and removal of qlogs with more general, often
   pre-existing, user preference and privacy control systems improve
   the ability to protect data sensitive or confidential to the end
   user.  In qlog, these data are typically (but not exclusively)
   contained in fields of the RawInfo type (see Section 7). qlog users
   should thus be particularly hesitant to include these fields for all
   but the most stringent use cases.

14.  IANA Considerations

   There are no IANA considerations.

15.  References

15.1.  Normative References

   [CDDL]     Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019, <https://www.rfc-editor.org/rfc/rfc8610>.

   [DNS-PRIVACY]
              Dickinson, S., Overeinder, B., van Rijswijk-Deij, R., and
              A. Mankin, "Recommendations for DNS Privacy Service
              Operators", BCP 232, RFC 8932, DOI 10.17487/RFC8932,
              October 2020, <https://www.rfc-editor.org/rfc/rfc8932>.

   [I-JSON]   Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
              DOI 10.17487/RFC7493, March 2015,
              <https://www.rfc-editor.org/rfc/rfc7493>.

   [JSON]     Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017,
              <https://www.rfc-editor.org/rfc/rfc8259>.

   [JSON-Text-Sequences]
              Williams, N., "JavaScript Object Notation (JSON) Text
              Sequences", RFC 7464, DOI 10.17487/RFC7464, February 2015,
              <https://www.rfc-editor.org/rfc/rfc7464>.

   [RFC1952]  Deutsch, P., "GZIP file format specification version 4.3",
              RFC 1952, DOI 10.17487/RFC1952, May 1996,
              <https://www.rfc-editor.org/rfc/rfc1952>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC4180]  Shafranovich, Y., "Common Format and MIME Type for Comma-
              Separated Values (CSV) Files", RFC 4180,
              DOI 10.17487/RFC4180, October 2005,
              <https://www.rfc-editor.org/rfc/rfc4180>.

   [RFC6839]  Hansen, T. and A. Melnikov, "Additional Media Type
              Structured Syntax Suffixes", RFC 6839,
              DOI 10.17487/RFC6839, January 2013,
              <https://www.rfc-editor.org/rfc/rfc6839>.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973,
              DOI 10.17487/RFC6973, July 2013,
              <https://www.rfc-editor.org/rfc/rfc6973>.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013, <https://www.rfc-editor.org/rfc/rfc7049>.

   [RFC7464]  Williams, N., "JavaScript Object Notation (JSON) Text
              Sequences", RFC 7464, DOI 10.17487/RFC7464, February 2015,
              <https://www.rfc-editor.org/rfc/rfc7464>.

   [RFC7932]  Alakuijala, J. and Z. Szabadka, "Brotli Compressed Data
              Format", RFC 7932, DOI 10.17487/RFC7932, July 2016,
              <https://www.rfc-editor.org/rfc/rfc7932>.

   [RFC8091]  Wilde, E., "A Media Type Structured Syntax Suffix for JSON
              Text Sequences", RFC 8091, DOI 10.17487/RFC8091, February
              2017, <https://www.rfc-editor.org/rfc/rfc8091>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

   [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017,
              <https://www.rfc-editor.org/rfc/rfc8259>.

   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
              <https://www.rfc-editor.org/rfc/rfc8446>.

   [RFC9000]  Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
              Multiplexed and Secure Transport", RFC 9000,
              DOI 10.17487/RFC9000, May 2021,
              <https://www.rfc-editor.org/rfc/rfc9000>.

15.2.  Informative References

   [QLOG-H3]  Marx, R., Niccolini, L., Seemann, M., and L. Pardue,
              "HTTP/3 and QPACK qlog event definitions", Work in
              Progress, Internet-Draft, draft-ietf-quic-qlog-h3-events-
              05, 10 July 2023, <https://datatracker.ietf.org/doc/html/
              draft-ietf-quic-qlog-h3-events-05>.

   [QLOG-QUIC]
              Marx, R., Niccolini, L., Seemann, M., and L. Pardue, "QUIC
              event definitions for qlog", Work in Progress, Internet-
              Draft, draft-ietf-quic-qlog-quic-events-05, 10 July 2023,
              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
              qlog-quic-events-05>.

   [RFC8546]  Trammell, B. and M. Kuehlewind, "The Wire Image of a
              Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April
              2019, <https://www.rfc-editor.org/rfc/rfc8546>.

   [RFC8742]  Bormann, C., "Concise Binary Object Representation (CBOR)
              Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020,
              <https://www.rfc-editor.org/rfc/rfc8742>.

Acknowledgements

   Much of the initial work by Robin Marx was done at the Hasselt and KU
   Leuven Universities.

   Thanks to Jana Iyengar, Brian Trammell, Dmitri Tikhonov, Stephen
   Petrides, Jari Arkko, Marcus Ihlar, Victor Vasiliev, Mirja
   Kuehlewind, and Jeremy Laine for their feedback and suggestions.

Change Log

   This section is to be removed before publishing as an RFC.

Since draft-ietf-quic-qlog-main-schema-06:

   *  Editorial reworking of the document (#331, #332)

   *  Updated IANA considerations section (#333)

Since draft-ietf-quic-qlog-main-schema-05:

   *  Updated qlog_version to 0.4 (due to breaking changes) (#314)

   *  Renamed 'transport' category to 'quic' (#302)

   *  Added 'system_info' field (#305)

   *  Removed 'summary' and 'configuration' fields (#308)

   *  Editorial and formatting changes (#298, #303, #304, #316, #320,
      #321, #322, #326, #328)

Since draft-ietf-quic-qlog-main-schema-04:

   *  Updated RawInfo definition and guidance (#243)

Since draft-ietf-quic-qlog-main-schema-03:

   *  Added security and privacy considerations discussion (#252)

Since draft-ietf-quic-qlog-main-schema-02:

   *  No changes - new draft to prevent expiration

Since draft-ietf-quic-qlog-main-schema-01:

   *  Change the data definition language from TypeScript to CDDL (#143)

Since draft-ietf-quic-qlog-main-schema-00:

   *  Changed the streaming serialization format from NDJSON to JSON
      Text Sequences (#172)

   *  Added Media Type definitions for various qlog formats (#158)

   *  Changed to semantic versioning

Since draft-marx-qlog-main-schema-draft-02:

   *  These changes were made in preparation for the adoption of the
      drafts by the QUIC working group (#137)

   *  Moved RawInfo, Importance, Generic events and Simulation events to
      this document.

   *  Added basic event definition guidelines

   *  Made protocol_type an array instead of a string (#146)

Since draft-marx-qlog-main-schema-01:

   *  Decoupled qlog from the JSON format and described a mapping
      instead (#89)

      -  Data types are now specified in this document and proper
         definitions for fields were added in this format

      -  64-bit numbers can now be either strings or numbers, with a
         preference for numbers (#10)

      -  binary blobs are now logged as lowercase hex strings (#39, #36)

      -  added guidance to add length-specifiers for binary blobs (#102)

   *  Removed "time_units" from Configuration.  All times are now in ms
      instead (#95)

   *  Removed the "event_fields" setup for a more straightforward JSON
      format (#101,#89)

   *  Added a streaming option using the NDJSON format (#109,#2,#106)

   *  Described optional optimization options for implementers (#30)

   *  Added QLOGDIR and QLOGFILE environment variables, clarified the
      .well-known URL usage (#26,#33,#51)

   *  Overall tightened up the text and added more examples

Since draft-marx-qlog-main-schema-00:

   *  All field names are now lowercase (e.g., category instead of
      CATEGORY)

   *  Triggers are now properties on the "data" field value, instead of
      separate field types (#23)

   *  The group_ids field in common_fields is now also named group_id

Authors' Addresses

   Robin Marx (editor)
   Akamai
   Email: rmarx@akamai.com

   Luca Niccolini (editor)
   Meta
   Email: lniccolini@meta.com

   Marten Seemann (editor)
   Protocol Labs
   Email: martenseemann@gmail.com

   Lucas Pardue (editor)
   Cloudflare
   Email: lucaspardue.24.7@gmail.com
