Last Call Review of draft-ietf-dnsop-dns-error-reporting-06
review-ietf-dnsop-dns-error-reporting-06-dnsdir-lc-tale-2023-11-05-00
review-ietf-dnsop-dns-error-reporting-06-dnsdir-lc-tale-2023-11-05-00
Hi Roy and Matt, this is my review on behalf of the DNS Directorate. There are only a couple of minor substantive points that I think really should be addressed before publication, and then my usual wordy musings on a number of other nits. Minor substantive points: Section 4 doesn't really describe which responses should get the Report-Channel option pushed. Obviously it requires EDNS0, and I'd guess I don't want to include it for unsigned responses. I'm also guessing I'd not send it if DO were clear, and that I probably would send it for any other DNSSEC-related query including for DNSSEC meta records. As an implementer, I'd prefer to not have to guess at those things and rather be told when I should be including it. Apparently in section 6.2. Also, for Section 5, is 0 an okay OPTION-LENGTH or must it be minimum 1 with an AGENT-DOMAIN of \0? Similarly, in 6.1 there is "returned an empty agent domain", which is described more expansively in the Overview as "empty, or is the null label" and suggests 6.1 should be clearer on what constitutes an "empty" agent domain. Bag o' Nits: Hyphenate adjectival phrase: s/stale DNSSEC signed zone/stale DNSSEC-signed zone/. For the Terminology section, perhaps start with something like, "DNS Terminology used in this document is from [BCP219], with these additions:" Because, for example, the first item, "reporting resolver", uses "validating recursive resolver" to define it. (I'll note, however, that BCP 219 defines "validating resolver" but not "validating recursive resolver". I'm not sure that "recursive" is a strictly necessary qualifier here.) (Also, I don't remember what the IETF style guidance is on reference by BCP# or by RFC#, and you did include BCP219 as RFC8499 later on.) s/wireformat/wire format/g per usage in BCP 219. "The reporting resolver builds this QNAME by concatenating the _er label" has _er unquoted here, but quoted later in the same sentence. They should be consistent; I favour the quotes. But also, the actual algorithm is specified in 6.1.1, so restating it here without even referencing that 6.1.1 is the authoritative definition is not ideal and perhaps should just be "... builds this QNAME by the algorithm in Section 6.1.1." Relatedly, though 4.2 spells it out, as I was reading this paragraph I wanted to see an example right away. Because you said "label" as an old DNS geek I could infer you meant "prepend _er with a dot" but I did first have the thought flash through my mind about whether the dot was there. Maybe I'm odd that way, but scrolling down and back to satisfy the visualization is slightly more cognitive work. "See the example in Section 4.2" could be something like. "This results in a name like _er.1.broken.test.7._er.a01.agent-domain.example., as constructed in the example of section 4.2." "The report query will ultimately arrive at the monitoring agent" is kind of odd to me, not because it isn't true but because it seems like a basic property of the DNS that we've been relying on for 40 years. The first three sentences could be shortened up a bit to something like, "The monitoring agent's reply to the report query MUST be cached by the reporting resolver." MUST describes it as essential, and the remaining sentences describe why. I'd like to see some recommendations for a suitable TTL on the reporting agent's reply. Also, why no guidance on the TXT rdata? Even something like "it could be as short as a null record to minimize cache overhead, or could contain additional information that the authority wishes to communicate to the resolver." For "This QNAME indicates extended DNS error 7", append ", Signature Expired, " as a useful appositive for the reader. The last paragraph of section 4.2 seems overlong. The first parenthetical is an unnecessary restatement of Terminology. "The agent can determine" sentence basically restates the last sentence of the prior paragraph. Finally "The monitoring agent can contact the operators" sentence, while it does explicitly recognize that the operator can be a different entity, reads a little odd for what I would expect to be the most common case of monitor and operator being the same entity. It also isn't clear that it adds anything that isn't intrinsically obvious, that the report could then be the trigger for fixing the problem. It looks to me like the whole last paragraph could be dropped with no important loss of meaning. "The DNS class is not specified in the error report." Huh, so, yeah, that's interesting. I honestly hadn't even though about class until this moment. Maybe "SHOULD assume IN" is a worthwhile addition somewhere, since that's nearly the entirety of DNS traffic, and the only class for which DNSSEC is currently defined? (I mean, if you want to start signing CHAOS TXT and reporting on failures, we've got a lot of other work to do.) "The reporting resolver MUST NOT use DNS error reporting to report a failure in resolving the report query." This feels ambigous to me, because even as an old DNSSEC geek I would, in the vernacular, describe a failure to validate as a failure to resolve. Short example phrases of what sort of thing you don't want to see happen would be good. 6.1.1, "When the specified agent domain is empty, or a null label (despite being not allowed in this specification), the report query will have "_er" as a top-level domain as a result and not the original query." Is this sentence useful? I can see it as a bit of a rationale for why to have a second _er, but the first sentence already provides sufficient rationale, and an empty/root agent domain already means the rest of this section is irrelevant. 6.3 I'd put RECOMMENDED TTL range here, as just how often the agent wants to keep hearing about failures for the same problem, and adding consideration for just how many resolvers could be responding for how many q-tuples. Even just magnitude would help. (Me? In the absence of any guidance I'd probably go with 15 minutes but could see anything from 5 to 60 being pretty reasonable.) s/Well known addresses/Well-known addresses/ Thanks for all your hard work on this draft! --tale