Summary: Has 2 DISCUSSes. Has enough positions to pass once DISCUSS positions are resolved.
Thank you for resolving most of my DISCUSS points (the additional clarification about when the one-hour retry does (not) apply in particular is helpful even if not strictly needed). However, it seems that my point about an "application protocol profile" for TLS 1.3 0-RTT was deferred until the resolution of a different thread covering 0-RTT, but that we never picked it back up. I think this document's profile needs to provide greater clarity about what specific messages are (or are not) allowed in 0-RTT data, i.e., listing permitted TLVs and having a column in the registry for whether a TLV is 0-RTT-safe. For example, Retry Delay simply cannot appear in a DSO message that is eligible for 0-RTT, so it accordingly cannot appear in a 0-RTT message, while EncryptionPadding should be safe to appear [as an additional TLV], provided there is a suitable primary TLV to attach it to. On first look [after long absence], the keepalive TLV should be safe to use as a 0-RTT primary TLV, since it elicits a response and only affects the current connection. Anyway, my main point here is that we need to place the burden of analysis on a TLV's safety on the authors of the spec introducing that TLV, and not on implementors or users of the protocol.
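As an illustration of the registry column suggested above, the following is a minimal Python sketch, not text from the draft. The entries encode only the first-look assessments given in this ballot (Retry Delay not eligible, Keepalive usable as a 0-RTT primary TLV, Encryption Padding usable as an additional TLV). The names `ZeroRttPolicy`, `ZERO_RTT_SAFE`, and `allowed_in_0rtt` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZeroRttPolicy:
    primary: bool     # may appear as the Primary TLV of a 0-RTT message
    additional: bool  # may appear as an Additional TLV of a 0-RTT message


# Hypothetical "0-RTT-safe" registry column, keyed by TLV name.
# Values follow the reviewer's first-look assessment only.
ZERO_RTT_SAFE = {
    "Keepalive": ZeroRttPolicy(primary=True, additional=False),
    "Retry Delay": ZeroRttPolicy(primary=False, additional=False),
    "Encryption Padding": ZeroRttPolicy(primary=False, additional=True),
}


def allowed_in_0rtt(primary, additionals=()):
    """Return True only if every TLV is marked 0-RTT-safe for its role.

    TLVs missing from the table are treated as unsafe, so the burden of
    analysis stays with whoever registers a new TLV.
    """
    unsafe = ZeroRttPolicy(primary=False, additional=False)
    if not ZERO_RTT_SAFE.get(primary, unsafe).primary:
        return False
    return all(ZERO_RTT_SAFE.get(t, unsafe).additional for t in additionals)
```

Defaulting unknown TLVs to "unsafe" is a design choice of this sketch; it mirrors the point that implementors should not have to perform the safety analysis themselves.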
[COMMENT section unchanged from original ballot; contents may no longer apply] Six authors exceeds five, so "there is likely to be discussion" about this being too large a number of authors. What is the justification for the author count? Do we need to specify some GREASE (per draft-ietf-tls-grease) for new TLV types in order to ensure the proper handling of unknown types? Section-by-section comments follow. Section 1 A DoH reference seems timely/apt. (But maybe then it is only "Some such transports" that can offer persistent sessions.) Maybe give some examples of advantages of server-initiated messages? Are we talking about letting the server push records with larger TTLs or notifying when the response to a query is changing, or just more mundane keepalive-type things? Section 3 The terms "initiator" and "responder" correspond respectively to the initial sender and subsequent receiver of a DSO request message or unacknowledged message, regardless of which was the "client" and "server" in the usual DNS sense. We just defined "client" and "server" explicitly (without reference to the "usual DNS sense"), so probably it's best to have this definition refer to the previous client/server definitions or clarify that the above definitions match the "usual DNS sense". When an anycast service is configured on a particular IP address and port, it must be the case that although there is more than one physical server responding on that IP address, each such server can be treated as equivalent. If a change in network topology causes packets in a particular TCP connection to be sent to an anycast server instance that does not know about the connection, the normal keepalive and TCP connection timeout process will allow for recovery. If after the connection is re-established, [...] Perhaps clarifying that "recovery" means "detecting the broken session and starting a new one" would be useful? (I guess Spencer's DISCUSS takes this a different direction.) 
DSO unacknowledged messages are unidirectional messages and do not generate any response. "Do not generate any response" at the DNS layer; any reason to mention that TCP will still ACK the bytes (or rather, that the "reliable" part of the data stream will need to do so)? Section 5.1 DSO messages MUST be carried in only protocols and in environments where a session may be established according to the definition given above in the Terminology section (Section 3). nit: is this "in only" or "only in" If the RCODE is set to any value other than NOERROR (0) or DSOTYPENI ([TBA2] tentatively 11), then the client MUST assume that the server does not implement DSO at all. In this case the client is permitted to continue sending DNS messages on that connection, but the client SHOULD NOT issue further DSO messages on that connection. I'm confused how the server would still have proper framing for subsequent DNS messages, since the DSO TLVs would be "spurious extra data" after a request header and potentially subject to misinterpretation as the start of another DNS message header. Section 5.1.3 It is probably worth explicitly noting that a middlebox MUST NOT make/forward a DSO request with TLVs that it does not implement. Section 5.2 If a DSO message is received where any of the count fields are not zero, then a FORMERR MUST be returned, unless a future IETF Standard specifies otherwise. This seems like ... not the conventional wording for this behavior, and subject to large debates about the meaning of "IETF Standard". (Similar language is used elsewhere, too.) Section 5.2.2 The start of this section seems to duplicate a lot of Section 3 -- e.g., the specification of the "Primary TLV"; request/response/unacknowledged as the types of messages; etc. It's unclear to me that this duplication of content is helpful to the reader. 
Unacknowledged messages MUST NOT be used "speculatively" in cases where the sender doesn't know if the receiver supports the Primary TLV in the message, because there is no way to receive any response to indicate success or failure of the request message (the request message does not contain a unique MESSAGE ID with which to associate a response with its corresponding request). Unacknowledged messages are only appropriate in cases where the sender already knows that the receiver supports, and wishes to receive, these messages. Having gone to the trouble to explicitly define (twice!) "request", "response" and "unacknowledged", it's pretty confusing to then use the English word "request" to refer to an "unacknowledged" message. Section 18.104.22.168 The specification for a TLV states whether that DSO-TYPE may be used in "Primary", "Additional", "Response Primary", or "Response Additional" TLVs. Perhaps this could be wordsmithed to avoid accidental misreading as exclusive-or? Section 5.3.1 When a DSO unacknowledged message is unsuccessful for some reason, the responder immediately aborts the connection. Doesn't this kill the client/server pairing for an hour? "For some reason" is very vague to induce such behavior, and could include transient internal errors. Section 6.1 When would it be appropriate for the server to send responses out of order? Section 22.214.171.124 nit: "RECONNECT DELAY" is used with inconsistent capitalization. Section 7.1 The description of the two timeout fields is predicated on understanding that it is only the response's incarnation of them that is authoritative; as an editorial matter, it might be nice to introduce this fact earlier. Section 7.1.1 It seems like we could consolidate the "equal to" and "greater than" cases into "greater than or equal to". Section 7.2.1 A client MUST NOT send a Retry Delay DSO request message or DSO unacknowledged message to a server. [...] 
nit: it must not send it as a response, either, so perhaps "MUST NOT send a Retry Delay DSO message to a server" is shorter and better. Section 9.3 I thought IANA liked to see a "registration template" for what subsequent registrations in a registry being created will need to specify. (That said, the IANA state is "IANA OK - Actions Needed", and one might expect that they know better than I do...) Section 10 I'm a little surprised to not see some discussion that this mechanism encourages the maintenance of persistent connections on DNS servers, which has an impact on resource consumption/load, but is not expected to be problematic because the server can tell the clients to go away if needed.
1) In addition to the bullet point in Section 6.2 that was flagged by Spencer, I would like to discuss the content of section 5.4 (DSO Response Generation). I understand the desire to optimize for the case where the application knows that no data will be sent as reply to a certain message; however, TCP does not have a notion of message boundaries and therefore cannot and should not act based on the reception of a certain message. Indicating to TCP that an ACK can be sent immediately in a specific situation is also problematic, as ACK processing is part of TCP's internal machinery. However, why is it important at all that a TCP-level ACK is sent out faster than the delayed ACK timer? The ACK receiver does not expose to the application the information that an ACK was received, and the delayed ACK timer only expires if no further data is received/sent by the ACK-receiver; therefore this optimization should not have any impact on application performance. I would just recommend removing this section and any additional discussion about delayed ACKs. Please note that the problem described in [NagleDA] only occurs for request-response protocols where no further request can be sent before the response is received. This is not the case in this protocol (as pipelining is supported). 2) Further regarding keep-alives: in sec 6.5.2: "For example, a hypothetical keepalive interval value of 100ms would result in a continuous stream of at least ten messages per second, in both directions, to keep the DSO Session alive." This does not seem correct. There should be at most one keep-alive message in flight. Thus the keep-alive timer should only be restarted after the keep-alive reply has been received. Also: "And, in this extreme example, a single packet loss and retransmission over a long path could introduce a momentary pause in the stream of messages, long enough to cause the server to overzealously abort the connection." 
This doesn't really make sense to me: As I said, TCP will retransmit and the keep-alive timer should not be running until the reply is received. If you want to abort the connection based on keep-alives quickly, before the TCP connection indicates a failure to you, you need to wait at minimum for an interval that is larger than the TCP RTO (which is usually 3 RTTs), which means you basically need to know the RTT. Also sec 7.1: "If the client does not generate the mandated keepalive traffic, then after twice this interval the server will forcibly abort the connection." Why must the server terminate the connection at all if the client refuses to send keep-alives? Isn't that what the inactivity timer is meant for? Usually only the endpoint that initiates the keep-alive should terminate the connection if no response is received. 3) There is another contradiction regarding the inactivity timer: Sec 6.2 says "A shorter inactivity timeout with a longer keepalive interval signals to the client that it should not speculatively keep an inactive DSO Session open for very long without reason, but when it does have an active reason to keep a DSO Session open, it doesn't need to be sending an aggressive level of keepalive traffic to maintain that session." which indicates that the client may leave the session open longer than indicated by the inactivity timer of the server. However, section 7.1.1 says that the client MUST close the connection when the timer has expired.
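The keep-alive behaviour argued for in point 2 (at most one keep-alive in flight, with the timer restarted only once the reply arrives) can be sketched as follows. This is a minimal illustration of the reviewer's proposal, not the behaviour specified in the draft. The class name `KeepaliveSender` and its methods are hypothetical, and times are plain integer milliseconds supplied by the caller.

```python
class KeepaliveSender:
    """Initiator-side keep-alive scheduling with at most one probe in flight."""

    def __init__(self, interval_ms, now_ms=0):
        self.interval_ms = interval_ms
        self.in_flight = False
        self.next_send_ms = now_ms + interval_ms

    def poll(self, now_ms):
        """Return True if a keep-alive should be sent at time now_ms."""
        if self.in_flight or now_ms < self.next_send_ms:
            # Either the interval has not elapsed, or a previous keep-alive
            # is still unanswered (TCP may be retransmitting it).
            return False
        self.in_flight = True
        return True

    def on_reply(self, now_ms):
        """Restart the interval timer only once the reply has arrived."""
        self.in_flight = False
        self.next_send_ms = now_ms + self.interval_ms
```

With a 100 ms interval, a reply delayed by a retransmission simply postpones the next keep-alive; it does not produce a backlog of ten messages per second.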
1) sec 3: I really find it a bit strange that there is normative language about error handling (as well as in the "same service instance" definition part) in the terminology section. Maybe move those paragraphs somewhere else...? Also the part about "long-lived operations" and message types provides far more information than just terminology, and I would also recommend moving it into its own section or maybe just having it as part of the intro. 2) Maybe call section 5 "Protocol specification" instead of "Protocol details"...? 3) Sec 5.1: "DSO messages MUST be carried in only protocols and in environments where a session may be established according to the definition given above in the Terminology section (Section 3)." I don't get this. Which part of section 3? Given section 3 is on terminology and this is a normative statement, I would recommend spelling out here explicitly what is meant. Do you mean the protocol must be connection-oriented, reliable, and providing in-order delivery? Anything else? However, given that you say two paragraphs onwards that this spec is only applicable for the use with TCP and TLS/TCP, do you even need to specify these requirements normatively? 4) sec 5.1 "It is a common convention that protocols specified to run over TLS are given IANA service type names ending in "-tls"." Not sure this is true. Isn't it usually just an "s" at the end? Or which registry are you talking about? 5) sec 5.1: "In some environments it may be known in advance by external means that both client and server support DSO, ..." I guess the client and server also need to know if TLS is supported or not. Maybe spell this out as well. 6) sec 5.1: "... therefore either client or server may be the initiator of a message." 
Maybe s/initiator of a message/initiator of a message exchange/ 7) sec 5.1.2: "Having initiated a connection to a server, possibly using zero round- trip TCP Fast Open and/or zero round-trip TLS 1.3, a client MAY send multiple response-requiring DSO request messages to the server in succession without having to wait for a response to the first request message to confirm successful establishment of a DSO session." Why is the ability to send more than one request related to TCP Fast Open/TLS1.3 0-RTT? These are two independent mechanisms to speed up processing. Mentioning TCP Fast Open/TLS1.3 0-RTT here is rather confusing. Respectively I also don't think that the sentence: "Similarly, DSO supports zero round-trip operation." is describing quite the same. 8) Further please provide references to TCP Fast Open and TLS1.3 and maybe rephrase this paragraph to use normative language: "Caution must be taken to ensure that DSO messages sent before the first round-trip is completed are idempotent, or are otherwise immune to any problems that could be result from the inadvertent replay that can occur with zero round-trip operation." Maybe just: "DSO messages sent with TLS1.3 0-RTT before the TLS handshake is completed or in TCP SYN data with use of TCP Fast Open MUST be idempotent." However, this is actually already required by TLS1.3 and TFO, so there is after all no need to just rephrase this requirement here (at least not normatively). I think it would be more useful for every DSO message type to specify if it can be sent in 0-RTT or not and require this for specification of future TLVs. 9) sec 5.1.3: "In cases where a DSO session is terminated on one side of a middlebox, and then some session is opened on the other side of the middlebox in order to satisfy requests sent over the first DSO session, any such session MUST be treated as a separate session." This sentence seems a bit non-sensical, which probably isn't great for a normative sentence. 
If a session is terminated and opened on the other end, doesn't that mean that you have two sessions? 10) sec 5.1.3: "A middlebox that is not doing a strict pass-through will have no way to know on which connection to forward a DSO message, and therefore will not be able to behave incorrectly." I'm not sure I understand this sentence. Can you clarify? 11) As already briefly mentioned by Ben, there is quite some redundant text in sec 5 (with 5.2) for handling of message IDs and TLVs. Given this text is normative, I would really recommend specifying it clearly only once. Please also check the rest of the doc for further things that are specified normatively multiple times. It usually makes it much clearer to specify it only once, at least normatively, at the appropriate position in the doc. 12) sec 5.3.1: "When a DSO unacknowledged message is unsuccessful for some reason, .." What does unsuccessful mean here? Can you clarify? 13) sec 6.5.2: "A corporate DNS server that knows it is serving only clients on the internal network, with no intervening NAT gateways or firewalls, can impose a higher keepalive interval, because frequent keepalive traffic is not required." I guess in this scenario it is probably most appropriate to not send any keep-alives… 14) sec 6.6: " o The server application software terminates unexpectedly (perhaps due to a bug that makes it crash)." This bullet point does not really make sense to me because once the app has crashed there is no longer any way for the server to perform any actions. 15) sec 7.1: "When a client is sending its second and subsequent Keepalive DSO requests to the server, the client SHOULD continue to request its preferred values each time. " I don't understand the SHOULD here. What else should the client put in these fields instead? 
16) sec 7.1.2: "Once a DSO Session has been established, if either client or server receives a DNS message over the DSO Session that contains an EDNS(0) TCP Keepalive option, this is a fatal error and the receiver of the EDNS(0) TCP Keepalive option MUST forcibly abort the connection immediately." This is normatively specified multiple times (3?) in the doc. Please consider specifying it only once, where most appropriate (probably section 7.1.2). 17) sec 7.1: "The Keepalive TLV is not used as an Additional TLV." This is redundant with the normative sentence in the next paragraph. Maybe just remove this sentence...? 18) +1 to Ben's discuss regarding the reconnection of clients. A TCP RST can be sent for many reasons and waiting for an hour seems not appropriate. I would rather recommend logging an error and directly trying to reconnect.
Thanks for addressing my Discuss, and for considering my comments. I'll stick the Discuss text here for ease of archaeology. --- previous Discuss I really like this document, and think it's headed the right direction. Of course I have four pages of comments, because reasons, but the only part I'm really confused about is this one ... I would have thought that if you end up with a different endpoint because your anycast address now resolves differently, the new endpoint would have to have shared a lot of state with the previous endpoint, for this to work: When an anycast service is configured on a particular IP address and port, it must be the case that although there is more than one physical server responding on that IP address, each such server can be treated as equivalent. If a change in network topology causes packets in a particular TCP connection to be sent to an anycast server instance that does not know about the connection, the normal keepalive and TCP connection timeout process will allow for recovery. What I would have expected to happen, is that the new endpoint sees a packet arrive that's not on a synchronized TCP connection, and immediately responds with a RST (reset), rather than the normal keepalive and TCP connection timeout process happening. That's also the way I'm reading https://tools.ietf.org/html/rfc7828#section-3.6. Is that not the way it's working for anycast these days? --- end of previous Discuss Everything else is a comment, so non-blocking, and please do the right thing. This is a nit, and your answer could be "no", and that's fine, but in some places this document uses "DSO keepalive", and in other places, "keepalive" with no qualifier. It's likely that less confusion would result if you could consistently call this "DSO keepalive", so that it is clearly NOT a TCP keepalive. Do the right thing, of course. Is the expectation that DSO would also be used in DNS over HTTP? 
I'm reading At the time of publication, DSO is specified only for DNS over TCP [RFC1035] [RFC7766], and for DNS over TLS over TCP [RFC7858]. Any use of DSO over some other connection technology needs to be specified in an appropriate future document. and noticing that https://tools.ietf.org/html/draft-ietf-doh-dns-over-https-12 is currently in IETF Last Call. This next one is well within the "Spencer wouldn't have done it this way, but Spencer's not the working group, or the IETF" range, but However, in the typical case a server will not know in advance whether a client supports DSO, so in general, unless it is known in advance by other means that a client does support DSO, a server MUST NOT initiate DSO request messages or DSO unacknowledged messages until a DSO Session has been mutually established by at least one successful DSO request/response exchange initiated by the client, as described below. Similarly, unless it is known in advance by other means that a server does support DSO, a client MUST NOT initiate DSO unacknowledged messages until after a DSO Session has been mutually established. seems fragile, especially in environments where clients can come and go, and servers may be addressed using anycast (so I knew in advance that the four servers at that anycast address supported DSO, but somebody installed a fifth server that does not). Is that unlikely to be a problem? I'm sure A single server may support multiple services, including DNS Updates [RFC2136], DNS Push Notifications [I-D.ietf-dnssd-push], and other services, for one or more DNS zones. When a client discovers that the target server for several different operations is the same target hostname and port, the client SHOULD use a single shared DSO Session for all those operations. A client SHOULD NOT open multiple connections to the same target host and port just because the names being operated on are different or happen to fall within different zones. 
This requirement is to reduce unnecessary connection load on the DNS server. is correct from the server side, but perhaps it's also worth noting that using multiple TCP connections unnecessarily increases the chances that data transfers happen during TCP slow start. If only one or two packets are being exchanged, that doesn't matter, but as more packets are exchanged, the difference increases, because congestion windows will grow more rapidly if fewer connections are used. I appreciate the inclusion of 5.4. DSO Response Generation But I've gotta ask. In the last paragraph of that section, I see o Use a networking API that lets the receiver signal to the TCP implementation that the receiver has received and processed a client request for which it will not be generating any immediate response. This allows the TCP implementation to operate efficiently in both cases; for requests that generate a response, the TCP ACK, window update, and DSO response are transmitted together in a single TCP segment, and for requests that do not generate a response, the application-layer software informs the TCP implementation that it should go ahead and send the TCP ACK and window update immediately, without waiting for the Delayed ACK timer. Unfortunately it is not known at this time which (if any) of the widely-available networking APIs currently include this capability. I would love to know if there are any widely-available network APIs that include this capability, before including this text in a standards-track RFC. Do you need help chasing this down? The text in 6.1. DSO Session Initiation seems rough to me, for a couple of reasons. The client may perform as many DNS operations as it wishes using the newly created DSO Session. Operations SHOULD be pipelined (i.e., the I don't understand why this would be a SHOULD. At least from the client's perspective, it's not needed for interoperation. client doesn't need wait for a response before sending the next message). 
The server MUST act on messages in the order they are transmitted, but responses to those messages SHOULD be sent out of order when appropriate. Is it correct to say that "responses to those messages SHOULD be sent when they become available, even if the responses are sent out of order"? If not, I'm probably missing what "when appropriate" means. I'm a bit mystified by this text in 6.2. DSO Session Timeouts In the usual case where the inactivity timeout is shorter than the keepalive interval, it is only when a client has a very long-lived, low-traffic, operation that the keepalive interval comes into play, to ensure that a sufficient residual amount of traffic is generated to maintain NAT and firewall state and to assure client and server that they still have connectivity to each other. I think the basics are correct - the inactivity timer and (DSO) keepalive interval are independent - but I'm struggling to think of a reason to send (DSO) keepalives that's NOT tied to maintaining NAT/firewall state, and there's a lot of text before the paragraph that mentions NAT/firewall, that talks about why either interval might be longer or shorter than the other, without considering NAT/firewall. Am I missing something here? ... and, now that I keep reading, 6.5.2. Values for the Keepalive Interval does a much better job of explaining how a (DSO) keepalive interval should be selected - I think you could reasonably delete most of the text about (DSO) keepalive intervals in section 6.2, and at most provide a forward pointer to 6.5.2. 
(As an aside, I think you probably want to cite https://tools.ietf.org/html/bcp142 as the operative recommendation for NAT behaviour toward TCP, since https://tools.ietf.org/html/rfc5382 has been updated) I found this text For long-lived DNS Stateful operations (such as a Push Notification subscription [I-D.ietf-dnssd-push] or a Discovery Relay interface subscription [I-D.ietf-dnssd-mdns-relay]), an operation is considered in progress for as long as the operation is active, until it is cancelled. This means that a DSO Session can exist, with active operations, with no messages flowing in either direction, for far longer than the inactivity timeout, and this is not an error. This is why there are two separate timers: the inactivity timeout, and the keepalive interval. Just because a DSO Session has no traffic for an extended period of time does not automatically make that DSO Session "inactive", if it has an active operation that is awaiting events. to be extremely helpful, but it's 28 pages into the document. Is there a place earlier in the document that describes these timers, where you could place this text? Maybe section 3/Terminology isn't the right place, but maybe there is a right place toward the front of the document. I'm not understanding why the SHOULDs are not MUSTs in this text: If, at any time during the life of the DSO Session, twice the inactivity timeout value (i.e., 30 seconds by default), or five seconds, if twice the inactivity timeout value is less than five seconds, elapses without there being any operation active on the DSO Session, the server SHOULD consider the client delinquent, and SHOULD forcibly abort the DSO Session. Perhaps part of my confusion is that I'm not sure what it means to "consider the client delinquent", but NOT to "forcibly abort the DSO session". But there are several "will forcibly abort"s in section 6.4.2, that sound more like MUST than SHOULD. 
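The threshold in the quoted text (twice the inactivity timeout, with a floor of five seconds) reduces to a one-line calculation. The sketch below is illustrative only; the function name `abort_threshold_seconds` is hypothetical, and the 15-second input corresponds to the "30 seconds by default" figure in the quoted text.

```python
def abort_threshold_seconds(inactivity_timeout_s):
    """Idle time, with no operation active, after which the server may
    consider the client delinquent: twice the inactivity timeout, but
    never less than five seconds."""
    return max(2 * inactivity_timeout_s, 5)
```

For example, `abort_threshold_seconds(15)` gives 30, while a one-second inactivity timeout is raised to the five-second floor.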
I don't think the MUST NOT in Normally a server MUST NOT close a DSO Session with a client. A server only causes a DSO Session to be ended in the exceptional circumstances outlined below. is quite right. Given that you have a bulleted list of reasons why a server would violate the MUST NOT immediately following this sentence, you might want to say "Normally a server does not close" here.
What is the status of running code for this? Are there any known and interoperable implementations? This is in the context of IETF 102 plenary discussion on implementations and interoperability.
Substantive Comments: §5.1," If the RCODE is set to any value other than NOERROR (0) or DSOTYPENI ([TBA2] tentatively 11), then the client MUST assume that the server does not implement DSO at all. In this case the client is permitted to continue sending DNS messages on that connection, but the client SHOULD NOT issue further DSO messages on that connection." Why is the SHOULD NOT not MUST NOT? Do you envision situations where it might make sense to violate the SHOULD NOT? §5.1.2: Are there security considerations for using zero round trip handshakes? §5.1.3: "In cases where a DSO session is terminated on one side of a middlebox, and then some session is opened on the other side of the middlebox in order to satisfy requests sent over the first DSO session, any such session MUST be treated as a separate session." By what? How would the ultimate endpoints know? §126.96.36.199: " If DSO unacknowledged message is received containing an unrecognized Primary TLV, with a zero MESSAGE ID (indicating that no response is expected), then this is a fatal error and the recipient MUST forcibly abort the connection immediately." Doesn't that make extensibility difficult? What if an extension adds a new unacknowledged message type that uses a new primary TLV? §6.1: "The server MUST act on messages in the order they are transmitted, but responses to those messages SHOULD be sent out of order when appropriate." The SHOULD seems more like a MAY, unless you mean for implementors to go looking for reasons to do things out of order. §6.4.1: Does "consider delinquent" entail any concrete actions beyond resetting the connection? §12.2: It seems like TLS1.3 should be a normative reference, given that it's used to describe the condition for a normative statement. Editorial Comments: - General: Please watch for comma splices. §1: -- " It is likely that future updates to these tools will add the ability to recognize, decode, and display the DSO data." That sentence may not age well. 
-- " A goal of this approach is to avoid the operational issues that have befallen EDNS(0), particularly relating to middlebox behaviour." Is there something that can be cited to describe the operational issues? §2: There's a fair amount of procedure, including normative statements, described in the terminology section. That would be better reserved for the sections that are more about procedure. Some readers only use terminology sections to look up terms on demand; such users may miss the procedure bits. §5.1, "DSO messages MUST be carried in only protocols " "MUST ... only" constructions can be ambiguous. Consider reformulating into a "MUST NOT" construction. §5.1.3: -- 3rd paragraph: Should "stateless" be "stateful"? -- "will have no way to know on which connection to forward a DSO message, and therefore will not be able to behave incorrectly." That seems like famous last words :-) §5.2.3, first paragraph: The MUST NOT seems more like a statement of fact. §5.3: The section has a number of redundant normative keywords. Please consider stating them authoritatively in one place, and making the others descriptive.
Thank you for this document, I find it to be well written and useful. One small nit: In Section 9.3: Requests to register additional new DSO Type Codes in the "Unassigned" range 0040-F7FF are to be recorded by IANA after Expert Review [RFC8126]. At the time of publication of this document, the Designated Expert for the newly created DSO Type Code registry is [*TBD*]. The last sentence is not going to age well (even though it is "TBD"). I think it should be removed from the document and such instructions should be sent separately to IANA in the IESG ballot.
Rich version of this review at: https://mozphab-ietf.devsvcdev.mozaws.net/D4358

IMPORTANT

S 5.3.
> field set to zero, and MUST NOT elicit a response.
>
> Every DSO request message (QR=0) with a nonzero MESSAGE ID field is
> an acknowledged DSO request, and MUST elicit a corresponding response
> (QR=1), which MUST have the same MESSAGE ID in the DNS message header
> as in the corresponding request.

How do I handle duplicate message IDs on the responder? Did I miss where you said this? Is this just an error?

S 9.3.
>
> Requests to register additional new DSO Type Codes in the
> "Unassigned" range 0040-F7FF are to be recorded by IANA after Expert
> Review [RFC8126]. At the time of publication of this document, the
> Designated Expert for the newly created DSO Type Code registry is
> [*TBD*].

What is the standard for the expert to follow?

COMMENTS

S 1.
> is appended to the end of the DNS message header. When displayed
> using packet analyzer tools that have not been updated to recognize
> the DSO format, this will result in the DSO data being displayed as
> unknown additional data after the end of the DNS message. It is
> likely that future updates to these tools will add the ability to
> recognize, decode, and display the DSO data.

I'm sure you will get to this soon, but what are the backward compatibility implications for the two endpoints?

S 3.
> The unqualified term "session" in the context of this document means
> the exchange of DNS messages over a connection where:
>
> o The connection between client and server is persistent and
>   relatively long-lived (i.e., minutes or hours, rather than
>   seconds).

This is a surprising taxonomy. I would assume that some of the options you are proposing would be relevant with a 30s connection (very long by HTTP standards!).

S 3.
> Where this specification says, "close gracefully," that means sending
> a TLS close_notify (if TLS is in use) followed by a TCP FIN, or the
> equivalents for other protocols. Where this specification requires a
> connection to be closed gracefully, the requirement to initiate that
> graceful close is placed on the client, to place the burden of TCP's
> TIME-WAIT state on the client rather than the server.

Does this mean that the server will ask the client to close?

S 3.
> connection to be closed gracefully, the requirement to initiate that
> graceful close is placed on the client, to place the burden of TCP's
> TIME-WAIT state on the client rather than the server.
>
> Where this specification says, "forcibly abort," that means sending a
> TCP RST, or the equivalent for other protocols. In the BSD Sockets

Because you bother to mention TLS above, what about non-close_notify TLS alerts?

S 3.
> the server's listening socket.
>
> The terms "initiator" and "responder" correspond respectively to the
> initial sender and subsequent receiver of a DSO request message or
> unacknowledged message, regardless of which was the "client" and
> "server" in the usual DNS sense.

It might be helpful to say earlier that this is a request/response protocol.

S 3.
>
> DNS Stateful Operations uses three kinds of message: "DSO request
> messages", "DSO response messages", and "DSO unacknowledged
> messages". A DSO request message elicits a DSO response message.
> DSO unacknowledged messages are unidirectional messages and do not
> generate any response.

This would be useful further up.

S 5.1.
>
> DNS over plain UDP [RFC0768] is not appropriate since it fails on the
> requirement for in-order message delivery, and, in the presence of
> NAT gateways and firewalls with short UDP timeouts, it fails to
> provide a persistent bi-directional communication channel unless an
> excessive amount of keepalive traffic is used.

Note that the in-order delivery requirement is going to make things not work super-well with DNS-over-QUIC unless you use one stream only.

S 5.1.
>
> If the RCODE is set to any value other than NOERROR (0) or DSOTYPENI
> ([TBA2] tentatively 11), then the client MUST assume that the server
> does not implement DSO at all. In this case the client is permitted
> to continue sending DNS messages on that connection, but the client
> SHOULD NOT issue further DSO messages on that connection.

Why is this a SHOULD and not a MUST?

S 5.1.3.
> to any problems that could be result from the inadvertent replay that
> can occur with zero round-trip operation.
>
> 5.1.3. Middlebox Considerations
>
> Where an application-layer middlebox (e.g., a DNS proxy, forwarder,

I'm having trouble with this section. Is it a set of requirements on middleboxes or statements of fact? If the latter, it seems like there are a bunch of ways for middleboxes to mess things up.

S 5.4.
> generate a response, the application-layer software informs the
> TCP implementation that it should go ahead and send the TCP ACK
> and window update immediately, without waiting for the Delayed ACK
> timer. Unfortunately it is not known at this time which (if any)
> of the widely-available networking APIs currently include this
> capability.

Are you going to make a recommendation here?

S 188.8.131.52.
> client a Retry Delay message, or by forcibly aborting the underlying
> transport connection) the client SHOULD try to reconnect, to that
> service instance, or to another suitable service instance, if more
> than one is available. If reconnecting to the same service instance,
> the client MUST respect the indicated delay, if available, before
> attempting to reconnect.

Do you want to recommend some sort of randomness around this value to avoid avalanche?

S 8.2.
> The table below indicates, for each of the three TLVs defined in this
> document, whether they are valid in each of ten different contexts.
>
> The first five contexts are requests or unacknowledged messages from
> client to server, and the corresponding responses from server back to
> client:

Nit: This text is a tiny bit hard to read, because you don't list S-P, etc.

S 10.
> messages are subject to the same constraints as any other DNS-over-
> TLS messages and MUST NOT be sent in the clear before the TLS session
> is established.
>
> The data field of the "Encryption Padding" TLV could be used as a
> covert channel.

Why not require this to be 0, then?
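
To make the last question concrete, here is a minimal sketch of the kind of receiver-side check an all-zero requirement would allow. It is illustrative only and not taken from the draft. It assumes each TLV is encoded as a 2-octet type, a 2-octet length, and then the data, and that Encryption Padding has type code 0x0003; the function names are hypothetical.

```python
import struct

# Assumption: type code for the Encryption Padding TLV; verify against the registry.
ENCRYPTION_PADDING_TYPE = 0x0003


def parse_tlvs(buf: bytes):
    """Yield (type, data) for each TLV in buf; raise ValueError if truncated."""
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 4:
            raise ValueError("truncated TLV header")
        # Assumed layout: 16-bit type, 16-bit length, both in network byte order.
        tlv_type, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        if len(buf) - offset < length:
            raise ValueError("truncated TLV data")
        yield tlv_type, buf[offset:offset + length]
        offset += length


def padding_is_all_zero(buf: bytes) -> bool:
    """Return True if every Encryption Padding TLV in buf carries only zero octets."""
    return all(
        not any(data)
        for tlv_type, data in parse_tlvs(buf)
        if tlv_type == ENCRYPTION_PADDING_TYPE
    )
```

With a rule like this, a receiver could reject (or at least log) any message whose padding carries nonzero octets, which would close the covert channel at the cost of one pass over the padding data.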