Summary: Has a DISCUSS. Has enough positions to pass once DISCUSS positions are resolved.
== Section 1 ==
"At the time of writing, almost all this DNS traffic is currently sent in clear (i.e., unencrypted). However there is increasing deployment of DNS-over-TLS (DoT) [RFC7858] and DNS-over-HTTPS (DoH) [RFC8484], particularly in mobile devices, browsers, and by providers of anycast recursive DNS resolution services. There are a few cases where there is some alternative channel encryption, for instance, in an IPsec VPN tunnel, at least between the stub resolver and the resolver. Today, almost all DNS queries are sent over UDP [thomas-ditl-tcp]. This has practical consequences when considering encryption of the traffic as a possible privacy technique. Some encryption solutions are only designed for TCP, not UDP and new solutions are still emerging [I-D.ietf-quic-transport] [I-D.huitema-quic-dnsoquic]."
This text made me wonder about the value of publishing this bis document at this point in time. Things are evolving so rapidly that, with respect to several of the new parts of this document (e.g., the last few paragraphs of Sec. 6.1.1, Sec. 126.96.36.199, Sec. 188.8.131.52), an immutable summary designed to represent reality over the long term doesn't really seem feasible right now. Why not wait to see how QUIC, DOH, ADD, ODNS, etc. shake out in the next few years and take this up then?
== Section 184.108.40.206 ==
"Users will only be aware of and have the ability to control such settings if applications provide the following functions:
o communicate clearly the change in default to users
o provide configuration options to change the default
o provide configuration options to always use the system resolver"
This doesn't seem true. If the third bullet isn't provided, users still have awareness and control. Also, the bullets seem redundant with the text above, as if this is saying "users only have awareness and control if they have awareness and control." As a result I'm not sure what this text is really meant to convey.
Please remove uses of "us" and "we," given that this is a consensus document.
"Privacy policies for these servers may or may not be available and users need to be aware that privacy guarantees will vary with network." --> This is unrealistic as almost no users understand any of this. I'd recommend removing this.
Section 6.1.3: The title says "Blocking of User Selected DNS Resolution Services" but the text is actually broader than that and applies to blocking of resolution services whether or not they are user-selected. I would suggest changing the title.
Section 220.127.116.11: "Privacy focused users should be aware of the potential for additional client identifiers in DoH compared to DoT and may want to only use DoH client implementations that provide clear guidance on what identifiers they add." Again this seems really unrealistic for the majority of users who have no idea what any of this means.
[[ questions ]]
[ section 18.104.22.168 ]
* Does "Strict DoT" have a definition somewhere? I couldn't find one in 8499 nor in 7858.
[[ nits ]]
[ section 1 ]
* "sent in clear", consider perhaps: "sent in the clear"
[ section 4.1 ]
* "those transaction" -> "those transactions"
[ section 6.1.1 ]
* "to limited subset" -> "to a limited subset"
[ section 6.1.3 ]
* "know to be used" -> "known to be used"
[ Thank you for addressing my DISCUSS point. ]
[Edit: I accidentally hit "Send" too early; I have another few comments, also non-blocking: 1: "Also, sometimes, the QNAME embeds the software one uses, which could be a privacy issue. For instance, _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.example.org."... Unless you are a Microsoft or DNS weenie, this is likely not at all clear -- what is being leaked here? The fact that the site uses TCP? LDAP? Windows? Goldbach's Conjecture? Example software? (I think adding a sentence here would be helpful...) ]
Thank you for this document - it's really useful, and readable as well. I do have a few small comments to (possibly) make it even better - I will in no way be offended if you ignore these...
The background on how DNS works is nicely written, and I'm going to point people at it when I need to explain how the DNS works -- but I think a better name example than: "What are the SRV records of _xmpp-server._tcp.example.com?" would be good -- SRV is an unusual record type, and names with underscores surprise people. I'd instead suggest "What are the MX records for example.com?" or "What is the A record for ftp.example.com?" -- I'm only mentioning this because the rest of the section is a very general introduction and this might confuse newcomers...
"At the time of writing, almost all this DNS traffic is currently sent in clear (i.e., unencrypted). However there is increasing deployment of DNS-over-TLS (DoT) [RFC7858] and DNS-over-HTTPS (DoH) [RFC8484], particularly in mobile devices, browsers, and by providers of anycast recursive DNS resolution services."
I think that you might want to remove the "particularly in ..." - I suspect that it will not age well; the document does say "At the time of writing" and "increasing", etc., but this document is likely foundational enough that it will still be referenced many many years from now, and this text may just cloud matters then.
Whatever the case, thanks again for this document!
Thank you for responding to the SECDIR review (and thank you to Stephen Farrell for performing the SECDIR review).
** Section 22.214.171.124. Per “These resolvers may have strong, medium, or weak privacy policies …”, what are the dimensions of this Likert-style scale? I recommend a simpler sentence -- “… may have varied privacy policies”.
** Section 6.1.1. Per “All major OS's expose the system DNS settings and allow users to manually override them if desired”, agreed. However, in managed environments, users may not be able to manually override these settings.
** Section 6.1.3. Per “User privacy can also be at risk if there is blocking (by local network operators or more general mechanisms) …”, what is a “more general mechanism”? Also, "local network operator" describes who is doing the blocking and "general mechanisms" seems to be describing a technique.
** Section 8. Editorial. Per “They are used for many reasons – some good, some bad.”, I’d recommend against making judgements and would instead stick to a rubric of operational practices and attacker behavior (say RFC7258). I’m not sure this sentence is needed.
Editorial nits --
** Section 6.1.1. Editorial. s/additionally highly dependent/highly dependent/
-- Section 12. Typo. s/apprecriated/appreciated/
Thank you for this document; it is important and I learned a few things.
Section 1
   At the time of writing, almost all this DNS traffic is currently sent in clear (i.e., unencrypted). However there is increasing deployment
nit: I think that "in the clear" is the term of art (add "the").
   Today, almost all DNS queries are sent over UDP [thomas-ditl-tcp].
It looks like (https://mailarchive.ietf.org/arch/msg/dns-privacy/1pZL1FA57hzE1e09mQ2HMg2aWYY/) Sara was going to follow up with the DITL authors to try and ascertain whether "almost all queries" is still accurate for the "UDP" aspect, though the IETF mailarchive search doesn't seem to find any more recent traffic on that topic. Do we know if anyone actually heard back about this (or the "sent in [the] clear" a few lines previously)? I do not pretend to have the expertise needed to judge how the changes deployed by major browsers affect the statistics for "all DNS traffic" (which presumably includes both stub-to-resolver and resolver-to-authoritative).
   This has practical consequences when considering encryption of the traffic as a possible privacy technique. Some encryption solutions are only designed for TCP, not UDP and new solutions are still emerging [I-D.ietf-quic-transport] [I-D.huitema-quic-dnsoquic].
[It looks like dnsoquic became draft-huitema-dprive-dnsoquic.]
Section 3
   multiple dynamic contexts of each device. This document does not attempt such a complex analysis, instead it presents an overview of the various considerations that could form the basis of such an analysis.
nit: looks like a comma splice.
Section 4.1
   authentication or authorization of the client (resolver). Due to the lack of search capabilities, only a given QNAME will reveal the resource records associated with that name (or that name's non-existence). In other words: one needs to know what to ask for, in
I agree with Warren that this statement ("only [...] will reveal [...] or that name's non-existence") is overly strong.
Section 4.2
   The DNS request includes many fields, but two of them seem particularly relevant for the privacy issues: the QNAME and the source IP address. "source IP address" is used in a loose sense of "source IP address + maybe source port number", because the port
In other contexts I've seen this combination referred to as the "transport address".
   The QNAME is the full name sent by the user. It gives information about what the user does ("What are the MX records of example.net?" means he probably wants to send email to someone at example.net, which may be a domain used by only a few persons and is therefore very revealing about communication relationships). [...]
(editorial) something like not-a-secret-cabal.example might make the example more visceral than example.net does.
   create more problems for the user. Also, sometimes, the QNAME embeds the software one uses, which could be a privacy issue. For instance, _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.example.org.
(nit) I trust that this can be made into a complete sentence while addressing Warren's more-substantive comment.
   There are also some BitTorrent clients that query an SRV record for _bittorrent-tracker._tcp.domain.example.
In a similar vein, I'm not sure what domain.example is supposed to represent here -- the domain of the author of the BitTorrent client?
   Therefore, all the issues and warnings about collection of IP addresses apply here. For the communication between the recursive
I mostly assume that this is intended to be a reference to the generic concerns about "IP addresses are PII", etc., that one is ambiently exposed to by reading enough about the Internet. (There does not seem to be previous discussion of "collection of IP addresses" in this document, which would seem to indicate that it is not trying to refer back to previous text.) If so, perhaps an extra word or two would help ("all the standard issues and warnings", "all the generic issues and warnings", etc.) clarify the intent of the reference.
   However, hiding does not always work. Sometimes EDNS(0) Client subnet [RFC7871] is used (see its privacy analysis in [denis-edns-client-subnet]). [...]
(nit) The wording here ("its privacy analysis") suggests that the referenced document is an authoritative/official IETF position, but it seems to be a blog post by a single individual. Using "one" or "a" rather than "its" would convey a less-authoritative connotation.
   In both cases, the IP address originating queries to the authoritative server is as sensitive as it is for HTTP [sidn-entrada].
I don't see how [sidn-entrada] supports the claim that end-user-adjacent DNS client IP addresses are equally sensitive as HTTP client IP Addresses; it mentions "sensitive" only twice (as "privacy-sensitive", admittedly, applying to such IP addresses, but as an assertion without justification) and "http" only in URLs (mostly in the references) and as an example request. It would feel more natural to use an IETF reference here, as well -- e.g., RFC 7624 discusses correlating client IP addresses with end users, RFC 7239 clearly covers privacy considerations for sending client IP addresses in the "forwarded" header field, and there are no doubt others -- though I do note the contents of the paragraph after this one.
   However, for both IPv4 and IPv6 addresses, it is important to note that source addresses are propagated with queries and comprise metadata about the host, user, or application that originated them.
(This "propagated with queries" is still contingent on EDNS(0) Client Subnet from the previous paragraph, right?)
Section 4.2.1
   cache poisoning attacks by off-path attackers. It is noted, however, that they are designed to just verify IP addresses (and should change once a client's IP address changes), they are not designed to actively track users (like HTTP cookies).
nit: comma splice.
Section 5.1
   not be.
   When other protocols will become more and more privacy-aware and secured against surveillance (e.g., [RFC8446], [I-D.ietf-quic-transport]), the use of unencrypted transports for DNS may become "the weakest link" in privacy. It is noted that at the time of writing there is on-going work attempting to encrypt the SNI in the TLS handshake [I-D.ietf-tls-sni-encryption].
This mention of encrypted "SNI" (now encrypted ClientHello) comes as a bit of a non sequitur. I suggest a bit of transition such as an additional clause at the end of the sentences ", which is one of the last remaining non-DNS cleartext identifiers of a connection target". (While the actual work itself has progressed to encrypting the entire ClientHello, I think it's okay to focus the exposition here on the SNI, as the relevant attribute.)
   It can be noted that if the user selects a single resolver with a small client population (even when using an encrypted transport) it can actually serve to aid tracking of that user as they move across network environment.
I wonder if it is worth adding another clause at the end: ", and that an attacker in a position to observe the moving user is likely also able to observe the likely-unencrypted DNS queries from the resolver to the authoritative servers" Also, nit: "environments" plural.
Section 5.2
   Traffic analysis of unpadded encrypted traffic is also possible [pitfalls-of-dns-encryption] because the sizes and timing of encrypted DNS requests and responses can be correlated to unencrypted DNS requests upstream of a recursive resolver.
We could (but don't have to) note that effective padding policies remain an open area of research.
Section 126.96.36.199
   o communicate clearly the change in default to users
I think this is intending to say "when the default application resolver changes away from the system resolver", but the present text is perhaps a little unclear about what "the change" is referring to.
Section 6.1.2
   Even if encrypted DNS such as DoH or DoT is used, unless the client has been configured in a secure way with the server identity, an active attacker can impersonate the server. [...]
More than the server identity is needed -- the credentials or trust anchor needed to authenticate a peer as that identity are also needed.
Section 6.1.3
   User privacy can also be at risk if there is blocking (by local network operators or more general mechanisms) of access to remote recursive servers that offer encrypted transports when the local resolver does not offer encryption and/or has very poor privacy policies. [...]
I suggest adding "e.g." before "when the local resolver" to avoid giving the impression that this is an exhaustive list.
   This is a form of Rendezvous-Based Blocking as described in Section 4.3 of [RFC7754]. Such blocklists often include servers know to be used for malware, bots or other security risks. In order to prevent circumvention of their blocking policies, some networks also block access to resolvers with incompatible policies.
Perhaps this is touching too much on the controversial topic, but it seems to me that the networks in question "attempt to block access"; whether or not they fully and reliably succeed at doing so is not clear. (See also the near-impossibility of closing covert channels in protocols.)
   It is also noted that attacks on remote resolver services, e.g., DDoS could force users to switch to other services that do not offer encrypted transports for DNS.
nit: comma after DDoS.
Section 188.8.131.52
   Some implementations have, in fact, chosen to restrict the use of the 'User-Agent' header so that resolver operators cannot identify the specific application that is originating the DNS queries.
With similar disclaimer as previously, perhaps "trivially identify"? There are other fingerprinting techniques possible even at, e.g., the TLS layer (that we discussed previously in this document!), which still apply to DoH.
Section 6.2
   This "protection", when using a large resolver with many clients, is no longer present if ECS [RFC7871] is used because, in this case, the authoritative name server sees the original IP address (or prefix, depending on the setup).
(side note) this has always been a bit confusing to me -- ECS is "client subnet", not "client address", and I don't really understand why someone would set the prefix length to the full 128 (or 32) bits of the address. Is there really a lot of non-truncated client addresses being sent around like this? How did that happen?
   So, requests to a given ccTLD may go to servers managed by organizations outside of the ccTLD's country. End users may not anticipate that, when doing a security analysis.
(Is this a "for example"? It seems plausibly relevant for non-cc TLDs as well.)
Section 7.1
   The IAB privacy and security program also have a work in progress [RFC7624] that considers such inference-based attacks in a more general framework.
I do not really think the final RFC constitutes a "work in progress" anymore.
Section 8
   Passive DNS systems [passive-dns] allow reconstruction of the data of sometimes an entire zone. They are used for many reasons -- some good, some bad. Well-known passive DNS systems keep only the DNS responses, and not the source IP address of the client, precisely for privacy reasons. Other passive DNS systems may not be so careful.
Perhaps not so well-intentioned, either...
   The revelations from the Edward Snowden documents, which were leaked from the National Security Agency (NSA) provide evidence of the use
nit: comma after "(NSA)".
Section 9
   To our knowledge, there are no specific privacy laws for DNS data, in any country. Interpreting general privacy laws like [data-protection-directive] or GDPR applicable in the European Union in the context of DNS traffic data is not an easy task, and we do not know a court precedent here. See an interesting analysis in [sidn-entrada].
This text is essentially unchanged since RFC 7626; did we do much of a search for whether the past five years have brought about changes in the legal landscape?
I'll add my thanks for this document. I have tripped on some of the issues in my experience, but some of the others described here were eye-opening. I'm also learning from the ensuing discussions.
Section 1: A couple of nits:
* "DNS relies on caching heavily ..." -- suggest "DNS relies heavily on caching ..."
* "Both are a big privacy concern since ..." -- suggest "Both are big privacy concerns since ...", unless you mean the two of them collectively (in which case, please say so)
I agree with Warren in that it's not clear what's leaking in the example at the bottom of the second paragraph of Section 4.2.
In Section 5.1, please expand "CPE" on first use.
I'm having trouble parsing the third paragraph of Section 5.2. The fourth paragraph in the same section needs some commas.
I have comments on the DISCUSS positions of Alissa and Warren, both of which I support to some extent:
On Warren’s point, which I wouldn’t have made a DISCUSS myself, I agree that editorial changes are warranted so as to make the point more clearly and with less baggage. I think we all know what the document means here, but not all readers will, and there’s sufficient FUD in this area that it behooves us to be very careful about how we say things. Avoiding things such as “alleged” and “it has long been claimed” is easy, would go a long way toward clarity and avoidance of feeding the FUD, and is worth a brief editing pass. I leave it to Warren to work the details out with the working group.
On Alissa’s first point -- why publish this update now, rather than waiting until more things shake out and settle down -- I basically agree, though I’m torn between thinking that waiting is better... and, on the other hand, acknowledging that enough has already changed that it’s important to get the update out there, and that it can be updated again later.
On her second point, I’ll go in a different direction: it’s bordering on silly to think that any real end user can be said to “be aware of and have the ability to control” anything related to DNS settings and resolution options. If “users” refers to those of us writing these specs, sure. But when we’re talking about our siblings and cousins and parents, who are doctors and nurses, chefs and bakers, bank tellers and car mechanics, there is no hope of awareness and understanding of the choices and their consequences, nor that any form of “communicate clearly” will really accomplish anything. I see little to recommend pretending that it will. So I, too, am not sure what this text is really meant to convey.
It would have been nice to include a narrative section indicating the differences with respect to rfc7626. Maybe turn the bullets in §14 into a short explanation of the major changes.
Hi,
Thank you for this document. I found it interesting and easy to read.
A few minor comments/nits that I spotted whilst reading this document:
"in clear (i.e., unencrypted)." => "unencrypted."
"However there is" => "However, there is"
"designed for TCP, not UDP and new" => "designed for TCP, not UDP, and new"
"It can be noted also that" => "It can also be noted that"
"Both are a big privacy concern" => "Both are significant privacy concerns"
"de-NAT DNS queries dns-de-nat " => "de-NAT DNS queries "?
Regards,
Rob