Ballot for draft-ietf-opsawg-mud-iot-dns-considerations

Discuss (2024-03-05 for -12) Sent

I unfortunately find this document very hard to understand. Overall, I think it would do better to split out the use cases. It seems to conflate or mix three distinct use cases: 1) A CPE with firewall+MUD-controller and an IoT MUD client, 2) A CPE with firewall with separate MUD controller and IoT MUD client, 3) An IoT device and a centralized enterprise MUD controller and centralized enterprise firewalling.
This then gets more complicated due to different assumptions of where the DNS resolver lives: On the CPE, on the LAN, in the Enterprise, on the public Internet, and what DNS protocols to use: port 53 or DoH/DoT/DoQ. After reading the document I do not find clear guidance on the MUD+DNS issue. On the contrary, I feel like this is impossible to deploy.

If the MUD controller and DNS resolver are not within the same CPE, it is unclear how communication should work to synchronize the required DNS lookups, results and firewall rules between MUD controller, IoT device and firewall/router device. It seems a protocol needs to be specified but hasn't been. The extensive discussion of synchronizing which DNS servers to use and then doing independent DNS lookups seems very problematic.

I feel that MUD still needs to evolve first to be further specified as a technology, before a document about (remaining) DNS considerations can be published.

I've put my specific item feedback in the comments section below.

Comment (2024-03-05 for -12) Sent

In the Introduction the numbered sections are wrong (eg "second section", "third section")

An issue is raised about delays as a result of a cold cache. I'm not sure why
that matters. It is a few seconds delay that should only happen once every
couple of weeks ?

Section 3.1.1 does not take prefetching into account ?

By doing the DNS lookups when the traffic occurs, then a passive
attacker can see when the device is active

Isn't this always the case? No application splits the DNS lookup from using
the obtained IP by a large amount of time to counter traffic analyses ?
Although the app might cache the result and re-use long past the TTL time,
which is a problem if the MUD controller and firewall base any ACL addition
and removal on DNS TTLs.

This does not require access to all on-path data, just to the
DNS requests to the bottom level of the DNS tree.

I don't fully understand this. If query minimization is used, does this
still apply? I don't think so, and if you care about DNS privacy, then surely
you use query minimization and perhaps DoT/DoH to an external party on top?

Aside from the list of records being incomplete, the list may
have changed between the time that the MUD controller did the
lookup and the time that the IoT device did the lookup

Isn't the IoT device forced to use the firewall/MUD DNS Server? If not, there
is already a DNS extraction channel and MUD has lost already.

In order to compensate for this, the MUD controller SHOULD
regularly perform DNS lookups in order to never have stale data.

This will cause a lot of unused lookups to be forever refreshed. This seems bad
and, with ephemeral redirections (eg 32217321835.pool.hue.philips.com), seems
likely to use up a lot of memory on the router for DNS and firewall rules.

it may be necessary to avoid local recursive resolvers.

Why? Shouldn't the IoT device be FORCED to use the local firewall/MUD
associated DNS server? (see earlier point)

The MUD controller SHOULD incorporate its own recursive caching DNS server.

What if the network firewalls all DNS except the allowed one? After all,
we want to protect the network by doing DNS filtering? If you think you
need the MUD controller to limit DNS queries it sends to the DNS recursive,
then perhaps the MUD application should honour TTL and not do repeated lookups?
If they would be outside of the TTL, then you have to do a real lookup against
the real DNS server anyway? But even so, a DNS caching server should be able to
easily serve many queries that are coming from the DNS cache.

These lookups must be rate limited to avoid excessive load on
the DNS servers, and it may be necessary to avoid local recursive
resolvers. The MUD controller SHOULD incorporate its own recursive
caching DNS server.

Wouldn't the local network now have those DNS entries constantly refreshed
in both the MUD controller and the local DNS cache (because every home
router has a DNS resolver with cache). It is often not possible to bypass
the DHCP obtained DNS servers because the router will block those to be
able to do content filtering and malware protection based on DNS.

If the IoT device does a DNS lookup, it goes via the router's advertised
DNS. If this is not relayed to the MUD controller, it will never know
and it won't work, so now there is another protocol needed to relay that
DNS from router to MUD controller?

A MUD controller that is aware of which recursive DNS server
the IoT device will use can instead query that server on a
periodic basis.

That's a race condition waiting to happen though. It _could_ do a cache
snoop query (eg without RD=1) and it wouldn't trigger an upstream query.
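
To make the cache-snoop idea concrete, here is a minimal sketch that builds a DNS query in RFC 1035 wire format with the RD (recursion desired) bit cleared. It is an illustration only: the function name and the query ID are invented for this example, and whether a given resolver answers RD=0 queries from its cache depends on its implementation and configuration.

```python
import struct

def build_dns_query(name: str, recursion_desired: bool, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record in RFC 1035 wire format."""
    # RD is the 0x0100 bit of the 16-bit flags word; all other flags stay 0.
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

# A cache-snoop style query (RD=0) and an ordinary recursive query (RD=1).
snoop = build_dns_query("example.com", recursion_desired=False)
normal = build_dns_query("example.com", recursion_desired=True)
```

The two messages differ only in the RD bit. A resolver that honours RD=0 would answer the first from cache or not at all, so it would not by itself trigger an upstream query.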

Any geographic load balancing will base the decision on the
geolocation of the recursive DNS server, and the recursive name
server will provide the same answer to the MUD controller as to
the IoT device.

There is no guarantee this is true. As the document says earlier, many
services return load balanced answers or round robin answers.

The resulting name to IP address mapping in the recursive name
server will be cached, and will remain the same for the entire
advertised Time-To-Live reported in the DNS query return. This
also allows the MUD controller to avoid doing unnecessary queries.

An IoT device that does a DNS lookup and gets an answer with TTL=3600
isn't stopped from using its answer for weeks. It is not even against
any RFC if it can keep its TCP connection up for those weeks. The MUD
controller doing repeated DNS lookups isn't going to know which of
these answers is still in use or not.
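
A small sketch of the mismatch, assuming a hypothetical firewall that removes an ACL entry as soon as the DNS TTL of the answer has elapsed. The class name, the addresses and the timeline are all invented for illustration; nothing here is taken from the draft or from a real MUD implementation.

```python
from dataclasses import dataclass

@dataclass
class AclEntry:
    """A firewall permit rule derived from one DNS answer (hypothetical model)."""
    address: str
    installed_at: float  # seconds since some reference point
    ttl: float           # TTL from the DNS answer, in seconds

    def expired(self, now: float) -> bool:
        # The rule is considered stale once the advertised TTL has elapsed.
        return now >= self.installed_at + self.ttl

# The MUD controller installs a rule from an answer with TTL=3600.
entry = AclEntry(address="192.0.2.10", installed_at=0.0, ttl=3600.0)

# The IoT device opens a TCP connection shortly afterwards and keeps it up.
connection_opened_at = 10.0
two_weeks_later = 14 * 24 * 3600.0

valid_at_connect = not entry.expired(connection_opened_at)
stale_while_connection_in_use = entry.expired(two_weeks_later)
```

Here `valid_at_connect` and `stale_while_connection_in_use` are both true: a TTL-driven controller would consider the rule removable while the device is still legitimately using the address.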

Some text at the end of Section 3 finally describes two very different
use cases that the document should have started out with. The home network
and the Enterprise. It realizes these things only really work well when
inside a single CPE, but won't work for Enterprise deployments. But I find
the Enterprise case weak. Why would an Enterprise ever allow its IoT devices
to send packets to outside the Enterprise?

The first is that the update service server must decide whether
to provide an IPv4 or an IPv6 literal.

Why can't such a REST API return a JSON struct with both?
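
As a rough sketch of what that could look like: the response body and its field names (`version`, `firmware_hosts`, `ipv4`, `ipv6`) are hypothetical and not taken from any real update protocol, and the addresses come from the documentation ranges.

```python
import ipaddress
import json

# Hypothetical update-service response carrying literals for both families.
response_body = json.dumps({
    "version": "1.2.3",
    "firmware_hosts": {
        "ipv4": ["192.0.2.10"],
        "ipv6": ["2001:db8::10"],
    },
})

def pick_literal(body: str, have_ipv6: bool) -> str:
    """Pick one address literal, preferring IPv6 when the device has it."""
    hosts = json.loads(body)["firmware_hosts"]
    candidates = hosts["ipv6"] if have_ipv6 and hosts["ipv6"] else hosts["ipv4"]
    # Validate before use; ip_address raises ValueError on malformed input.
    return str(ipaddress.ip_address(candidates[0]))

v6_choice = pick_literal(response_body, have_ipv6=True)
v4_choice = pick_literal(response_body, have_ipv6=False)
```

The device, which knows its own connectivity, makes the choice, so the server no longer has to guess the address family.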

A third problem involves the use of HTTPS

But it was talking about downloading a signed firmware blob. You don't
need HTTPS for that. The signature can detect unauthorized modifications.
If you want privacy, you can still do HTTPS to an IP and just not validate
the X.509 certificate. With TLS ephemeral keys that gets you privacy for
passive monitors.

A non-deterministic name or address that is returned within
the update protocol, the MUD controller is unable to know
what the name is. It is therefore unable to make sure that the
communication to retrieve the new firmware is permitted by the
MUD enforcement point.

Sure, but if the IoT device can tell the MUD controller which name it
needs for the firmware update, a compromised IoT device can tell it
the firmware is at evil.com and get to unauthorized places. The only
way to avoid this is for the IoT device to limit the domains allowed
statically from the signed mud profile, and not allow HTTP redirects or
random CDN redirects. If you allow that, you have again lost already.

Define what "geofenced names" are? Only existing locally? Within a LAN
or home or isp network?

Use of public resolvers instead of the provided DNS resolver,
whether Do53, DoQ, DoT or DoH is discouraged. Should the network
provide such a resolver for use, then there is no reason not to
use it, as the network operator has clearly thought about this.

This seems to contradict the earlier text. If the DHCP handed out 8.8.8.8,
then the IoT device and the MUD controller both using 8.8.8.8 could end up
on a different node of the ANYCAST cluster and thus get different replies
when there is round robin etc happening.

It is recommended that use of non-local resolvers is only done
when the locally provided resolvers provide no answers to any
queries at all, and do so repeatedly. The use of the operator
provided resolvers SHOULD be retried on a periodic basis, and
once they answer, there SHOULD be no further attempts to contact
public resolvers.

Assuming the recommendation is valid for MUD controller and IoT device,
there is again a race condition where they can end up using different
DNS servers and thus getting different answers and getting the wrong ACLs
installed. It really seems to me that more coordination is needed between
the MUD controller, the IoT device and the DNS server, and that this is
really only possible if the MUD controller is the firewall/router and
DNS server within a single CPE.

Finally, the list of public resolvers that might be contacted
MUST be listed in the MUD file as destinations that are to
be permitted! This should include the port numbers (i.e., 53,
853 for DoT, 443 for DoH) that will be used as well.

Doesn't this again open up a free channel for a compromised IoT device?
If it can reach 8.8.8.8 it can exfiltrate by sending arbitrary queries
to it. I would assume a MUD file would limit DNS queries to certain
domains but if the IoT device directly connects to 8.8.8.8 (and even worse,
over DoT or DoH), then all MUD protection has been bypassed.

Use of Encrypted DNS connection to a local DNS recursive resolver
is the preferred choice.

I would argue this is not always the preferred choice. Especially with
the ADD drafts allowing Delegated Credentials et al.

IoT devices that reach out to the manufacturer at regular
intervals to check for firmware updates are informing passive
eavesdroppers of the existence of a specific manufacturer's
device being present at the origin location.

While true, it is unavoidable and perhaps the responsibility of the
CPE to have a DoT/DoH upstream trusted server or to use a public
trusted one where being part of a client pool gives some limited privacy
back. But regardless, I don't think it relates to MUD and is not a
consideration for MUD IoT.

Discuss (2024-03-05 for -12) Sent

** Section 7.
   The use of a publicly specified firmware update protocol would also
   enhance privacy of IoT devices.  In such a system, the IoT device
   would never contact the manufacturer for version information or for
   firmware itself.

Why does the use of a “publicly specified firmware update protocol” necessarily enhance privacy?  Do all such protocols have the properties described in the second sentence?

Comment (2024-03-05 for -12) Sent

Thank you to Chris Wood for the SECDIR review.

I support the DISCUSS position of Paul Wouters. Judging by the title, abstract, framing in Section 1 and the tradeoff presented in Section 8, it isn’t clear if this guidance is for “all IoT devices” or “only IoT devices that support MUD”, or whether it is intended for both enterprise and home deployments. In the spirit of simplifying the adjudication of feedback, I tried my best not to duplicate points from Paul's COMMENT ballot. Some of the points below are further examples of this scope confusion.

** Section 1.
In TLS 1.3, with or without the use of ECH, middleboxes cannot rely
on SNI inspection because malware could lie about the SNI.

Can’t malware also lie about the SNI in TLS 1.2?

** Section 2.
Although this document is not an IETF Standards Track publication, it
adopts the conventions for normative language to provide clarity of
instructions to the implementer.

Isn’t it common for BCPs to use RFC 2119 language?

** Section 3.1.3. Who are these service providers whose role it is to maintain reverse mappings relative to the actors I thought were in question – device manufacturer and owner/operator?

** Section 4.1 Editorial. Is it “4.1. Use of IP address literals inprotocol” or “in-protocol”?

** Section 4.1. Editorial.
(often over
HTTPS, sometimes with a POST, but the method is immaterial)

If it is immaterial, why mention it?

** Section 4.1
The current firmware model of the device is sometimes provided and
then the authoritative server provides a determination if a new
version is required and, if so, what version.

What’s the authoritative server in this model? Is it the “vendor system” mentioned in the previous sentence?

** Section 4.1
The first is that it eliminates problems with firmware updates that
might be caused by lack of DNS, or incompatibilities with DNS.

I’m confused about what the anti-pattern is. It’s acceptable for the device to initially query the authoritative server via IP (I say acceptable because there isn’t guidance cautioning against it), but if the authoritative server responds with a URI containing an IP, this is a problem?

** Section 4.1
The first is that the update service server must decide whether to
provide an IPv4 or an IPv6 literal. A DNS name can contain both
kinds of addresses, and can also contain many different IP addresses
of each kind.

Couldn’t the update service know which IP address family it was contacted over and serve a response back in that family (in addition to including both addresses in the HTTP API)?
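
A minimal sketch of that server-side selection, under the assumption that the service can see the peer address of the incoming connection. The function names and the `LITERALS` table are hypothetical, and the addresses are from the documentation ranges.

```python
import ipaddress

def family_of_peer(peer: str) -> int:
    """Return 4 or 6 for the address family the client connected over."""
    addr = ipaddress.ip_address(peer)
    # A dual-stack socket reports IPv4 clients as IPv4-mapped IPv6 addresses.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return 4
    return addr.version

# Hypothetical table of download-server literals, keyed by address family.
LITERALS = {4: "192.0.2.10", 6: "2001:db8::10"}

def literal_for_peer(peer: str) -> str:
    """Choose the literal in the same family the request arrived over."""
    return LITERALS[family_of_peer(peer)]
```

For example, `literal_for_peer("198.51.100.7")` yields the IPv4 literal and `literal_for_peer("2001:db8::1")` the IPv6 one. This only shows that the choice is mechanically possible; it says nothing about whether address literals are a good idea in the first place.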

** Section 4.1
Finally, it is common in some content-distribution networks (CDN) to
use multiple layers of DNS CNAMEs in order to isolate the content-
owner's naming system from changes in how the distribution network is
organized.

Understood. Who is this a problem for? What if the vendor doesn’t use a CDN?

** Section 4.3. Who is this section directed at? Based on Section 4’s title of “DNS and IP Anti-Patterns for IoT device Manufacturers”, it seems like it is manufacturers. However, the text in this section seems to be discussing CDN behavior. Are manufacturers supposed to avoid CDNs that follow this behavior?

** Section 5. What is the actionable BCP or design guidance from this section?

** Section 6.
The difficult part is determining what to put into the MUD file
itself. There are currently tools that help with the definition and
analysis of MUD files, see [mudmaker]. The remaining difficulty is
now the actual list of expected connections to put in the MUD file.
An IoT manufacturer must now spend some time reviewing the network
communications by their device.

How is this germane to MUD and DNS?

** Section 6.5

Should the network provide
such a resolver for use, then there is no reason not to use it, as
the network operator has clearly thought about this.

Can more be said about the basis of this confidence? I can see the rationale in some enterprise scenarios. Section 7 makes a case for the opposite advice -- “The use of unencrypted (Do53) requests to a local DNS server exposes the list to any internal passive eavesdroppers, and for some situations that may be significant, particularly if unencrypted Wi-Fi is used.”

** Section 7. I found this Privacy Considerations lacking a basic explanation of the DNS-focused threat model. I think the start of that threat assessment is that “many IoT devices are automatically configured to connect to the public internet to enable automatic updates, send telemetry to the manufacturers, or enable integration with manufacturer or third-party services”.

Using the tradeoff template of the security considerations in Section 8, a privacy consideration trade-off might be that “device owners/operators want to leak as little onto the internet and to the device manufacturer while still getting the functionality of the IoT device”.

** Section 7.

IoT devices that reach out to the manufacturer at regular intervals
to check for firmware updates are informing passive eavesdroppers of
the existence of a specific manufacturer's device being present at
the origin location.

-- Is it common in an enterprise setting for IoT devices to be able to auto-update themselves from firmware downloaded off the internet? In my limited enterprise experience, other end-points and network devices are typically managed. Is there some nuance that these devices can only be managed by the manufacturer?

-- In an enterprise setting wouldn’t it be best practice to prevent devices from beaconing out to the internet with DNS blackholing or IP address filters?

** Section 7.
IoT device manufacturers are encouraged to find ways to anonymize
their update queries. For instance, contracting out the update
notification service to a third party that deals with a large variety
of devices would provide a level of defense against passive
eavesdropping.

This is good advice.

-- Is the DNS footprint of most IoT devices predominately queries for updates? To revisit the previous comment about the threat model, don’t some IoT devices use DNS to initiate traffic for more things than just update queries negating the benefit of a third-party update infrastructure?

-- Not knowing much about how this is done in production, is this realistic guidance based on current IoT manufacturer practices? Collecting less data from device owners/operators seems to be the opposite of the trends I have seen.

Comment (2024-03-02 for -12) Sent

# Internet AD comments for draft-ietf-opsawg-mud-iot-dns-considerations-12
CC @ekline

* comment syntax:
  - https://github.com/mnot/ietf-comments/blob/main/format.md

* "Handling Ballot Positions":
  - https://ietf.org/about/groups/iesg/statements/handling-ballot-positions/

## Comments

### S3

* "ip6.arpa", not "ipv6.arpa"

  This is correct elsewhere in the doc, but this seems to have been missed.

### S3.2

* "recursive servers should cache data for at least..."

  ... while still respecting TTLs in the replies, yes?

### S6.4

* I suggest finding some text to point to that defines what a "geofenced"
  name is.  Right now this feels like the kind of thing that everyone
  "just knows what it means", but could use some formal description.

## Nits

### S3.1

* s/mapping/mappings/?

### S4.1

* s/inprotocol/in-protocol/

### S4.2

* "all those addresses DNS for the the name" ->
  "all those addresses in the DNS for the name"

Comment (2024-03-06 for -12) Sent

Thanks for this document. Paul Wouters' DISCUSS rings true for me. I do have one small comment. Probably this is just an editing mistake left over from some earlier revision. The last para of the Intro is:

```
   The Security Considerations section covers some of the negative
   outcomes should MUD/firewall managers and IoT manufacturers choose
   not to cooperate.
```

It doesn't, though. I guess either fix the SecCons to do what the Intro says, or change the Intro to accurately describe the SecCons.

Comment (2024-03-29) Sent

Thank you Dave for the IOTDIR, Nicolai for the DNSDIR, and Christopher for the SECDIR review.

I support Paul and Roman's DISCUSS, and all the comments posted by the reviewers. I will note that IOTDIR, DNSDIR, and SECDIR have independently raised issues that I hope the authors will work through when they publish the next version of the draft.

Comment (2024-03-06 for -12) Sent

I support Paul's DISCUSS position and many of his comments.

I understand why this is seeking BCP status, but I think it's unusual for something claiming to be "Considerations" to seek that status. I think this is more suited to Informational.

Please expand "ECH" on first use.

If Section 3.1 describes a "failing strategy", why is it only NOT RECOMMENDED?

In Section 3.2, what is a "physical ACL"?

Also, Section 3.2 seems to use a lot of space describing the benefits of DNS caching, TTLs, etc. Someone with a moderate understanding of DNS would already get all of this. I think it could use some editing down.

Section 4.1: I think "inprotocol" should be "in-protocol", although I don't know if that's a word either. I would use neither; it's fine without.

Also in Section 4.1, the final paragraph (or at least its first sentence) seems a bit mangled.

The title of Section 6.1 doesn't appear (to me) to match what it says.

For Section 6.4, can we define "geofenced" or provide a reference? This is the first time that term is used in this document.

For a BCP, Section 6.5 feels mushy. It says the best practice is (thing), but then buffers it with SHOULDs. I think you should say what the best practice is and stop. If someone elects to deviate, then they're not doing what the best practice is.

===

From Orie Steele, incoming ART Area Director:

In 4.2. Use of non-deterministic DNS names in-protocol

> Within that control protocol references are made to additional content at other URLs. The values of those URLs do not fit any easily described pattern and may point at arbitrary names.

Seems to rely on RFC9238 to define what constitutes a well formed URL, which in turn references RFC3986

https://www.rfc-editor.org/rfc/rfc3986#section-7.1

I believe this imposes some interoperability considerations regarding IDNA.

Some comments or guidance on what international domain names and URLs are acceptable might be useful, please consider a reference to https://datatracker.ietf.org/doc/html/rfc5895

Comment (2024-03-29) Not sent

I support Paul's DISCUSS position and many of his comments.

I also agree with Murray's comments, especially regarding Informational status possibly being a better choice.

Comment (2024-03-07 for -12) Sent

No objection from transport layer specific issues; however, this was not an easy read for me. It often conflates process steps with practice, issues and recommendations, and is hence hard to follow.

I strongly support Paul's discuss points.

I have following comments/questions and I believe the document will be enriched if those are addressed:

- Abstract : it says -

This document details concerns about how Internet of Things devices use IP addresses and DNS names.

I am under the impression that these concerns are not for the entire community of IoT devices, but rather for those that use MUD and want to use DNS. Also, detailing only concerns does not seem to be the entire goal of this document. Why does the document start with such a statement?

- Please define "antipattern" in this document. I understand it comes from an external source; that definition can change at any time, and the usage of "antipattern" in this document may then become out of context. It is better to agree on what "antipattern" means in the context of this document.

- Section 1 : This refers to sections by ordinal to describe particular things, and those references do not map to the section numbers of this document. I think there is no need for such calling out of sections in the introduction; it is confusing.

- Section 1 :

The third section of this document details how current trends in DNS resolution such as public DNS servers, DNS over TLS (DoT), DNS over QUIC (DoQ), and DNS over HTTPS (DoH) cause problems for the strategies employed.

Where can I find the promised details? DoQ is only mentioned once in later sections.

- Section 6:
- Please explain the geofenced name before providing recommendations for it.
- How should the manufacturers interpret "strong recommendation" ? Is there any particular reason not to use normative text here?

Comment (2024-03-05 for -12) Sent

# Éric Vyncke, INT AD, comments for draft-ietf-opsawg-mud-iot-dns-considerations-12

Thank you for the work put into this document. It is a nice piece of work, clear and easy to read.

Please find below some non-blocking COMMENT points (but replies would be appreciated even if only for my own education).

Special thanks to Henk Birkholz for the shepherd's detailed write-up including the WG consensus and the *very light* justification of the intended status.

Please note that Dave Thaler is the IoT directorate reviewer (at my request) and you may want to consider this iot-dir review as well when it is available (no need to wait for it though):
https://datatracker.ietf.org/doc/draft-ietf-opsawg-mud-iot-dns-considerations/reviewrequest/19052/

You may also expect an Int-dir review as:
https://datatracker.ietf.org/doc/draft-ietf-opsawg-mud-iot-dns-considerations/reviewrequest/19051/ (not yet assigned though)

I hope that this review helps to improve the document,

Regards,

-éric

# COMMENTS (non-blocking)

## Absence of mDNS

Is mDNS used in the context of MUD ? If so, then it should be mentioned here.

## Abstract

Let's be positive s/This document details concerns /This document details considerations /

## Section 1

s/Some Enterprises do this already. /Some organisations do this already. / ? (e.g., governmental agencies, ...)

Suggest to put the sentence `The first section of this document...` on its own 1-sentence paragraph.

## Section 3

Suggest to always use "DNS names" rather than plain "names". Applicable in several places in the document.

Isn't the mapping from address to DNS names usually called "reverse mapping" ? E.g., section 3.1.3 uses `reverse names`.

## Section 3.1

Suggest to add "often" in the too assertive sentence `Attempts to map IP address to names in real time fails for a number of reasons`.

## Section 3.1.2

`They could determine when a home was occupied or not`: actually when I leave home to travel (e.g., to IETF-119) most of my IoT devices are still active as I want to 'control' my home from remote.

## Section 3.1.3

`Service providers` is rather vague in this context, is it the access/internet SP or a hosted-IoT-service ?

## Section 3.2

It seems indeed to be the most obvious technique. So obvious that it should be given a hint in the introduction.

Is there a common use case where the MUD controller is changing location ? I.e., then having different forward DNS resolution answers ? I would also expect that the authoritative geo-sensitive servers will use a short DNS TTL in their answers.

## Section 4

Thanks for pointing me to "antipatterns", I learned something :-) OTOH, I had to follow the link to understand the paragraph :-(

## Section 4.3

Unsure whether using a real case with Amazon is useful here...

## Section 5

Whether the MUD devices and the MUD controllers use the same recursive resolver is probably orthogonal to the use of DoT/DoH.

## Section 6

AFAIK, LLDP can also be used per RFC 8520 in addition to DHCP to retrieve the MUD string.

## Section 6.5

The section title is about `Prefer DNS servers learnt from DHCP/Route Advertisements` but the text is only about DHCP.

Btw, the exact wording is "Route*r* Advertisement" and a reference to RFC 8106 could be useful.

Which are the reasons in `There are a number of reasons to avoid this` ?

## Section 7

`The use of non-local DNS servers exposes the list of names resolved to a third party` even if the recursive resolution is done 'locally' (i.e., on a CPE) it will leak the MUD requests, we could argue that using a non-local recursive resolver will only expose the requests to this non-local server but not to the actual authoritative server.

## References

Please note that DNR & DDR are published as RFC 9462 / 9463 (dated November 2023).

Yes (for -12) Unknown

No Objection (2024-03-05 for -12) Sent

(4.1) There's an editorial error here.

"An authoritative server might be tempted to provide an IP address literal inside the protocol: there are two arguments (anti-patterns) for doing this."

I'm expecting two reasons someone might use an IP literal.

"The first is that it eliminates problems with firmware updates that might be caused by lack of DNS..."

Yep, that tracks.

"The second reason to avoid a IP address literal in the URL is when an inhouse content-distribution system is involved..."

But this is making the opposite point! It appears that this section is actually presenting ONE (not two) reason to use IP literals, and then several reasons that's a bad idea. So say that!