
Early Review of draft-elkhatabi-verifiable-telemetry-ledgers-05
review-elkhatabi-verifiable-telemetry-ledgers-05-opsdir-early-clarke-2026-05-27-00

Request Review of draft-elkhatabi-verifiable-telemetry-ledgers-04
Requested revision 04 (document currently at 07)
Type Early Review
Team Ops Directorate (opsdir)
Deadline 2026-06-15
Requested 2026-05-24
Requested by Eliot Lear
Authors Bilal El Khatabi
I-D last updated 2026-06-04 (Latest revision 2026-06-04)
Completed reviews Opsdir Early review of -05 by Joe Clarke
Comments
Questions that the ISE would like you to answer:
A. Is the subject of this document relevant to the RFC Series?

Although the series was originally broadly scoped to include any aspect of computer networking, its scope has become centered on the Internet.

B. Is this document technically competent, as far as you can tell?

Does the work build upon industry state-of-the-art?  Are protocols and interfaces well specified?

C. Is this document in reasonable (not necessarily final) editorial shape?

Works should concisely and clearly make their point.  Was it easy to discern the point, and could you understand the approach being offered?

D. Are the Abstract and Introduction of this document reasonably clear?

Does the introduction provide enough background for those Internet techies who may not be experts in the particular subject matter?  Do the Title and Abstract fairly and accurately summarize the contents?

E. Does the document make clear upfront how the specification does/does not relate to past or current IETF activities?

F. How else can the document be improved?
Assignment Reviewer Joe Clarke
State Completed
Request Early review on draft-elkhatabi-verifiable-telemetry-ledgers by Ops Directorate Assigned
Posted at https://mailarchive.ietf.org/arch/msg/ops-dir/E60pN9b9uJ3ttdzqAlPJnuDvOrQ
Reviewed revision 05 (document currently at 07)
Result Has issues
Completed 2026-05-27
Hi,

I have been selected as the Operational Directorate (opsdir) reviewer for this
Internet-Draft.

The Operational Directorate reviews all operational and management-related
Internet-Drafts to ensure alignment with operational best practices and that
adequate operational considerations are covered.

A complete set of _"Guidelines for Considering Operations and Management in
IETF Specifications"_ can be found at
https://datatracker.ietf.org/doc/draft-ietf-opsawg-rfc5706bis/.

While these comments are primarily for the Operations and Management Area
Directors (Ops ADs), the authors should consider them alongside other feedback
received.

- Document: draft-elkhatabi-verifiable-telemetry-ledgers-05

- Reviewer: Joe Clarke

- Review Date: May 27, 2026

- Intended Status: Informational

---

## Summary

**Has Issues**

## General Operational Comments: Alignment with RFC 5706bis

I recommend the authors consider adding a short, consolidated "Operational
Considerations" section that gathers and cross-references the operationally
relevant text already scattered through the draft.

Per RFC 5706bis, an informational document with this much deployment surface
would benefit from an explicit Operational Considerations section. I'd suggest
a short section (e.g., a new Section 11) that covers:

* Health/fault management: what an operator should monitor (replay-state
  persistence, AEAD-failure rate, anchoring backlog, calendar reachability,
  day-artifact write success, OTS upgrade lag).
* Configuration management: enumeration of deployment-tunable parameters
  (acceptance window, connectivity window, retention period, calendars used,
  anchoring mode warn/strict, peer-quorum threshold) with recommended defaults
  where applicable.
* Performance/scaling: expected artifact size relative to record count, OTS
  submission rate, and effect of multi-calendar submission.
* Verifying correct operation: how an operator (not auditor) sanity checks
  that the pipeline is healthy.

Additionally, I did find a few "minor" issues that should be shored up to make
this ledger more consumable and deployable.

First, in your introduction, you state: "The gateway is expected to maintain
UTC time for ingest_time assignment and day-artifact rollover. The device is
not required to keep UTC wall-clock time for this profile."  But nowhere in
the draft do I see guidance that the gateway should use an authoritative time
source.  Consider shoring this up with text along these lines (perhaps in a new
Operational Considerations section):

NEW:

The gateway is expected to maintain UTC time for ingest_time assignment and
day-artifact rollover. Deployments SHOULD use an authenticated
time-synchronization mechanism (for example, authenticated NTP or equivalent)
and SHOULD monitor clock health, since gateway clock errors propagate directly
into committed ingest_time values and UTC day-boundary assignment. Behavior on
detected clock step, regression, or loss of synchronization is a deployment
policy matter and SHOULD be documented by the operator. The device is not
required to keep UTC wall-clock time for this profile.
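
To make the clock-health point concrete, here is a minimal sketch (illustrative
only, not proposed draft text) of how a gateway might distinguish a clock step,
a regression, and a loss of synchronization by comparing wall-clock progress
against a monotonic clock. The function name, the event names, and the
one-second `max_drift` threshold are assumptions for illustration; none of them
come from the draft.

```python
from enum import Enum


class ClockEvent(Enum):
    OK = "ok"
    STEP = "step"
    REGRESSION = "regression"
    UNSYNCHRONIZED = "unsynchronized"


def classify_clock(prev_utc: float, now_utc: float,
                   monotonic_elapsed: float, synchronized: bool,
                   max_drift: float = 1.0) -> ClockEvent:
    """Classify gateway clock health between two samples.

    prev_utc / now_utc: wall-clock readings in seconds since the epoch.
    monotonic_elapsed: seconds elapsed on a monotonic clock between samples.
    synchronized: whether the time-sync client currently reports sync
                  (how that is obtained is deployment-specific).
    max_drift: hypothetical tolerance in seconds; not a value from the draft.
    """
    if not synchronized:
        return ClockEvent.UNSYNCHRONIZED
    wall_elapsed = now_utc - prev_utc
    if wall_elapsed < 0:
        # Wall clock moved backwards relative to the previous sample.
        return ClockEvent.REGRESSION
    if abs(wall_elapsed - monotonic_elapsed) > max_drift:
        # Wall clock jumped relative to the monotonic clock.
        return ClockEvent.STEP
    return ClockEvent.OK
```

What the gateway does on anything other than `OK` (pause ingest, flag affected
records, alert) remains the deployment policy decision that the suggested text
asks operators to document.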

Related to this, I see "naked" references to OTS and RFC 3161 in this draft.
Outside of the abstract, those should be linked xrefs.

Section 4.2 states "Gateways SHOULD persist replay state across restart." Given
that loss of replay state is described in Section 10 as enabling key-reuse and
unsafe acceptance scenarios, I'd argue this is closer to a MUST in practice, or
at minimum the SHOULD should be paired with the explicit consequence. What
about:

OLD:

Gateways SHOULD persist replay state across restart. If replay state is lost,
gateways SHOULD record a continuity break event and SHOULD NOT silently
re-accept counters that could already have been committed.

NEW:

Gateways SHOULD persist replay state across restart. Loss of replay state can
lead to nonce reuse and unsafe acceptance under the same AEAD key (see Section
10). If replay state is lost, gateways MUST record a continuity-break event,
MUST NOT silently re-accept counters that could already have been committed,
and MUST require explicit resynchronization or AEAD-key rotation before further
frames from affected devices are accepted.
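
As a rough illustration of the behavior the NEW text asks for (again, not
proposed draft text), the sketch below refuses frames after replay state is
lost until the affected device's key has been rotated. The class and method
names, the `continuity_break` event string, and the simple strictly increasing
counter check are all assumptions; the draft's actual acceptance-window logic
is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ReplayGuard:
    # Highest committed counter per device; None means persisted state was lost.
    last_counter: Optional[Dict[str, int]]
    events: List[str] = field(default_factory=list)
    rekeyed: Set[str] = field(default_factory=set)
    state_lost: bool = False

    def __post_init__(self) -> None:
        if self.last_counter is None:
            # Record the continuity break instead of silently starting fresh.
            self.events.append("continuity_break")
            self.state_lost = True
            self.last_counter = {}

    def key_rotated(self, device_id: str) -> None:
        """Hypothetical hook: the device now uses a fresh AEAD key."""
        self.rekeyed.add(device_id)
        self.last_counter.pop(device_id, None)

    def accept(self, device_id: str, counter: int) -> bool:
        # After state loss, every device is affected until its key is rotated.
        if self.state_lost and device_id not in self.rekeyed:
            return False
        if counter <= self.last_counter.get(device_id, -1):
            return False  # replayed or stale counter
        self.last_counter[device_id] = counter
        return True
```

A gateway restarting without persisted state would construct
`ReplayGuard(last_counter=None)`; frames from a device are then rejected until
`key_rotated()` is called for it.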

In Section 7, the Class A description contains:

A Class A verifier can recompute canonical-record digests from disclosed record
artifacts, the **ADR-003** Merkle root, ...

"ADR-003" doesn't show up anywhere else in this draft. Maybe this is some
holdover from something deployment- or project-specific?

OLD:

A Class A verifier can recompute canonical-record digests from disclosed record
artifacts, the ADR-003 Merkle root, block/day consistency, the day artifact
digest, manifest artifact digests, and enabled anchoring or publication proofs
when those artifacts are present.

NEW:

A Class A verifier can recompute canonical-record digests from disclosed record
artifacts, the batch Merkle root as defined in Section 4.5, block/day
consistency, the day artifact digest, manifest artifact digests, and enabled
anchoring or publication proofs when those artifacts are present.

Likewise, Appendix E enumerates reasons including **postcard_pod_id_mismatch**
and **postcard_fc_mismatch**. The term "postcard" is not introduced anywhere in
the document body. Either define it (or reference it) or rename these reasons
to something more general.

Throughout this draft, "trackone" is used for media types. Why is "trackone" in
the name?  This appears to be a project name/codename.  I feel this should be
more generic if this is being proposed as an industry document.

---