Summary: Has a DISCUSS. Has enough positions to pass once DISCUSS positions are resolved.
(1) I am not entirely sure what we mean by saying that temporary addresses must have a lifetime that is "statistically different" across different addresses, and accordingly I am not sure that the procedures in Sections 3.4 and 3.5 for refreshing a temporary address achieve that property. (The text about "statistically different" does not appear in RFC 4941, and the relevant parts of Section 3.4/3.5 are unchanged from RFC 4941, so this may be the result of an incomplete update.) Specifically, when Section 3.5 says to "[repeat] the actions described in Section 3.4, starting at step 4" that seems to (for long-lived PIOs) result in, e.g., the new temporary address having lifetime TEMP_VALID_LIFETIME starting at exactly the time when the previous one expired; wouldn't an observer be able to trivially correlate "new address showed up with TEMP_VALID_LIFETIME" with "address that expired at that time"? Note that the attacker does not need to know the value of TEMP_VALID_LIFETIME in order to perform a discrete Fourier transform (DFT) on the distribution of "new address" events. (Furthermore, we apparently qualify the "repeating the actions" with some caveats, so it is not exactly "repeating the actions" anymore. That said, the caveats currently listed in Section 3.5 don't seem to be enough to provide the "statistically different" property in what I believe to be the intended interpretation.) (2) Please fix the reference for DupAddrDetectTransmits in Section 3.8 -- it is defined in RFC 4862, while RetransTimer is in RFC 4861. (3) RFC 4941 cannot be a *normative* reference of this document if we are going to Obsolete it.
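To make Discuss point (1) concrete, here is a minimal sketch; it is not text from the draft. The helper names `creation_times` and `intervals`, the 24-hour lifetime, and the 12-hour jitter range are all assumptions chosen for illustration. It only shows that with a fixed lifetime every regeneration interval is identical (a strictly periodic series, which is what a DFT would pick out), whereas drawing a fresh offset per address makes the intervals differ.

```python
import random

def creation_times(n, lifetime, max_jitter=0, rng=None):
    """Return the creation times (seconds) of n successive temporary addresses.

    Each address is replaced when its own lifetime runs out. That lifetime is
    `lifetime` minus a per-address offset drawn uniformly from [0, max_jitter].
    All names and values are illustrative assumptions, not the draft's text.
    """
    rng = rng or random.Random()
    times, now = [], 0.0
    for _ in range(n):
        times.append(now)
        now += lifetime - rng.uniform(0, max_jitter)
    return times

def intervals(times):
    """Gaps between consecutive creation events."""
    return [b - a for a, b in zip(times, times[1:])]

fixed = intervals(creation_times(6, lifetime=86400))
jittered = intervals(creation_times(6, lifetime=86400, max_jitter=43200,
                                    rng=random.Random(1)))
print(len(set(fixed)))     # one distinct interval: fully periodic
print(len(set(jittered)))  # several distinct intervals
```

Whether the offset should apply to the initial lifetime or to the gap between successive addresses is exactly the ambiguity raised above; the sketch picks the second reading only for the sake of having something to run.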
Section 1.2 Having followed many of the references from the Introduction, it seems that there could be an additional aspect to the problem statement, namely the question of whether an attacker can (statistically) determine whether or not there is a host at a given address/IID. When such an ability is present, techniques (e.g., pen-testing) involving scanning the entire address space become more feasible. I do not think this potential aspect needs to be mentioned, per se, but do not know if it was considered for inclusion or not. The correlation can be performed by o An attacker who is in the path between the node in question and the peer(s) to which it is communicating, and who can view the IPv6 addresses present in the datagrams. o An attacker who can access the communication logs of the peers with which the node has communicated. (side note) I suppose if some other node in the path kept logs and the attacker got access to those logs, that would also allow the correlation, but that's rather an edge case and we don't claim to have an exhaustive list, so I don't see a need to add complications to this text. Use of temporary addresses will not prevent such payload-based correlation, which can only be addressed by widespread deployment of encryption as discussed in [RFC7624]. Nor will it prevent an on-link observer (e.g. the node's default router) to track all the node's addresses. nit: s/to track/from tracking/ Section 2.1 Many nodes also have DNS names associated with their addresses, in which case the DNS name serves as a similar identifier. Although the DNS name associated with an address is more work to obtain (it may require a DNS query), the information is often readily available. In such cases, changing the address on a machine over time would do little to address the concerns raised in this document, unless the DNS name is changed as well (see Section 4). nit: perhaps say "at the same time"? 
The use of a constant identifier within an address is of special concern because addresses are a fundamental requirement of communication and cannot easily be hidden from eavesdroppers and other parties. Even when higher layers encrypt their payloads, (editorial) the two paragraphs before this one seem to be examples (DNS names, HTTP cookies) of "identifier[s] that [are] recognizable over time within different contexts" as discussed in the paragraph prior to them. This paragraph is getting back to why we care about constant identifiers in IP addresses; I wonder if some kind of (list?) formatting for the previous two paragraphs might help indicate the structure of the discussion. Changing global scope addresses over time limits the time window over which eavesdroppers and other information collectors may trivially correlate network activity when the same address is employed for multiple transactions by the same node. Additionally, it reduces the window of exposure of a node via an address that gets revealed as a result of active communication. I'm not 100% sure that I understand what is being exposed by this "window of exposure" -- is it just "there is a node at this address and it is responsible for all activities using that address"? Thus, perhaps "window of exposure for such correlation"? (Similar text also appears in the Abstract.) The security and privacy implications of IPv6 addresses are discussed in detail in [RFC7721], [RFC7707], and [RFC7217]. A sentence essentially identical to this one already appeared in the Introduction; I'm not sure if we should de-duplicate. Section 3.1 4. Temporary addresses must have a limited lifetime (limited "valid lifetime" and "preferred lifetime" from [RFC4862]), that should be statistically different for different addresses. The lifetime We should probably be more specific about what "statistically different" is supposed to mean. 
For example, is it intended to relate to the initial value associated with a freshly generated address (i.e., "should not be exactly 24 hour lifetime at time of generation") or the offset between different addresses ("should not be exactly 24 hours more than the previous one")? 5. By default, one address is generated for each prefix advertised by stateless address autoconfiguration. The resulting Interface Identifiers must be statistically different when addresses are configured for different prefixes. That is, when temporary [In contrast, this use of "statistically different" is both (1) clarified further and (2) for a time-independent quantity, so the interpretation is pretty clear as-is.] Section 3.3.1 I think we need to say something about the random number being long enough or getting more random bits in step 2 if there aren't enough bits, or similar. Just "obtain a random number" doesn't say what the number is sampled from, and could cover, e.g., https://xkcd.com/221/ . Section 3.3.2 1. Compute a random identifier with the expression: RID = F(Prefix, Net_Iface, Network_ID, Time, DAD_Counter, secret_key) [...] F(): A pseudorandom function (PRF) that MUST NOT be computable from the outside (without knowledge of the secret key). F() MUST also be difficult to reverse, such that it resists attempts to obtain the secret_key, even when given samples of the output of F() and knowledge or control of the other input parameters. F() SHOULD produce an output of at least 64 bits. F() could be implemented as a cryptographic hash of the concatenation of each of the function parameters. SHA-256 [FIPS-SHS] is one possible option for F(). Note: MD5 [RFC1321] is considered unacceptable for F() [RFC6151]. I recognize that this is just the RFC 7217 construction with the 'Time' parameter added, but it's not entirely clear that we want to be recommending the plain "hash of concatenation" option without additional caveats. 
While having the secret key be the last element in the bitstring seems to close off the length-extension class of attacks, we don't say anything about performing the concatenation with fixed-width types (or a length prefix), as is needed for non-malleability of the hash inputs. (This is particularly of note for the IPv6 prefix, that one might naturally encode as just the prefix parts, not necessarily fixed length, but also applies to other parameters, including some of the "Net_Iface" examples given in RFC 7217.) There is also no discussion about the potential for hash collisions (or, more generally, attacks) across this construction and the RFC 7217 construction. Guidance to not reuse a secret_key for both constructions would be in order. (I will note that it may be tempting to upgrade to an HMAC construction, and while that will certainly work, modulo the need for length prefixes/fixed-length input, it is overkill for this case.) Finally, the guidance for "SHOULD produce an output of at least 64 bits" could perhaps be revisited; any useful cryptographic hash these days is going to have at least 128 bits of output, which is certainly enough for generating an IID! Prefix: The prefix to be used for SLAAC, as learned from an ICMPv6 Router Advertisement message. (side note) is the "as learned from an ICMPv6 RA" an important prerequisite, or could a prefix learned in some other fashion still be usable? which this interface is associated. Additionally, Simple DNA [RFC6059] describes ideas that could be leveraged to generate a Network_ID parameter. This parameter is SHOULD be employed if some form of "Network_ID" is available. nit: s/is SHOULD/SHOULD/ Section 3.4 7. The node MUST perform duplicate address detection (DAD) on the generated temporary address. If DAD indicates the address is already in use, the node MUST generate a new randomized interface identifier, and repeat the previous steps as appropriate up to TEMP_IDGEN_RETRIES times. 
If after TEMP_IDGEN_RETRIES consecutive attempts no non-unique address was generated, the node MUST log a system error and SHOULD NOT attempt to generate a temporary address for the given prefix for the duration of the node's attachment to the network via this interface. [...] Just to confirm my understanding: "for the duration of the node's attachment to the network" means that even if a new RA+PIO is received, the node still ignores that prefix? Section 3.6 determine that the link change has occurred. One such process is described by "Simple Procedures for Detecting Network Attachment in IPv6" [RFC6059]. Detecting link changes would prevent link down/up nit: we have already referred to the abbreviated name "Simple DNA" earlier in this document, so the expanded title does not seem necessary here. Section 3.8 REGEN_ADVANCE -- 2 + (TEMP_IDGEN_RETRIES * DupAddrDetectTransmits * RetransTimer / 1000) Please indicate the units of this value (the division by 1000 indicates it is likely measured in seconds). DESYNC_FACTOR -- A random value within the range 0 - MAX_DESYNC_FACTOR. It is computed once at system start (rather than each time it is used) and must never be greater than (TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE). Computing only at startup and not changing it could perhaps run into issues with maintaining the invariant, when TEMP_PREFERRED_LIFETIME and REGEN_ADVANCE are configurable after startup. (Changing DESYNC_FACTOR more often, and having the range be more like half of the overall lifetime, would be one approach for achieving the "statistically different" property mentioned in my Discuss point.) Section 4. The desires of protecting individual privacy versus the desire to effectively maintain and debug a network can conflict with each other. [...] (editorial) this sentence lacks parallelism of structure. Perhaps: % The desire to protect individual privacy can conflict with the desire % to effectively maintain and debug a network. 
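For the Section 3.8 comments above, a small sketch of the arithmetic may help. It assumes RetransTimer is expressed in milliseconds, so that the division by 1000 yields seconds; the function names and the sample values are illustrative assumptions, not taken from the draft.

```python
import random

def regen_advance(temp_idgen_retries, dup_addr_detect_transmits, retrans_timer_ms):
    """REGEN_ADVANCE in seconds, assuming RetransTimer is given in milliseconds."""
    return 2 + (temp_idgen_retries * dup_addr_detect_transmits * retrans_timer_ms) / 1000

def desync_factor(max_desync_factor, temp_preferred_lifetime, regen_adv, rng=random):
    """Draw DESYNC_FACTOR so it never exceeds TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE."""
    upper = min(max_desync_factor, temp_preferred_lifetime - regen_adv)
    if upper < 0:
        raise ValueError("TEMP_PREFERRED_LIFETIME must be at least REGEN_ADVANCE")
    return rng.uniform(0, upper)

adv = regen_advance(3, 1, 1000)   # sample values, assumed for illustration
print(adv)                        # 5.0
```

Calling `desync_factor` again whenever TEMP_PREFERRED_LIFETIME or REGEN_ADVANCE is reconfigured (or once per address) would keep the invariant intact, which a value computed only at startup cannot guarantee.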
Section 5 o Addresses all errata submitted for [RFC4941]. There are errata reports against RFC 4941 that are still in the state "reported"; the responsible AD should probably process those before this document gets published. Section 9 Overall these security considerations seem pretty comprehensive and well-described -- thank you! If a very small number of nodes (say, only one) use a given prefix for extended periods of time, just changing the interface identifier part of the address may not be sufficient to mitigate address-based network activity correlation, since the prefix acts as a constant identifier. [...] It might be worth noting some scenarios where this commonly occurs, e.g., residential households that only have a single computer. (Is it also the case for mobile phones?) fairly large number of nodes. Additionally, if a temporary address is used in a session where the user authenticates, any notion of "privacy" for that address is compromised. Compromised for the part(ies) that receive the authentication information, at least. That does not necessarily include a passive observer in the network. While this document discusses ways of obscuring a user's IP address, the method described is believed to be ineffective against I don't think "obscuring" is the right word -- the IP address is still visible; we're just trying to remove some of the information content from it over long periods of time. I understand the desire to remove the word "permanent" from the RFC 4941 version, but this still doesn't seem right. Perhaps the goal could be rephrased as something about making the IP address less useful as a persistent (numerical) identifier. Ingress filtering has been and is being deployed as a means of preventing the use of spoofed source addresses in Distributed Denial of Service (DDoS) attacks. In a network with a large number of nodes, new temporary addresses are created at a fairly high rate. 
This might make it difficult for ingress filtering mechanisms to distinguish between legitimately changing temporary addresses and spoofed source addresses, which are "in-prefix" (using a topologically correct prefix and non-existent interface ID). This can be addressed by using access control mechanisms on a per-address basis on the network egress point. Should we say something about the corresponding resource consumption increase at the egress point? Section 11.1 One might argue that RFC 7217 is merely informative, since we duplicate in full the IID-generation algorithm from it (with modifications). RFC 8190 is only referenced to note that we specifically do *not* use terminology from it; that seems like it does not really meet the threshold for being a normative reference.
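Returning to the Section 3.3.2 comment on the plain "hash of concatenation", the following sketch shows one way to make the hash input unambiguous by length-prefixing every field. The field encodings, the 2-byte length prefix, and the `domain` label used to keep this construction separate from the RFC 7217 one are assumptions for illustration; the draft specifies none of them.

```python
import hashlib
import struct

def encode_fields(*fields: bytes) -> bytes:
    """Concatenate fields, each preceded by its length as a 2-byte big-endian integer."""
    return b"".join(struct.pack("!H", len(f)) + f for f in fields)

def rid(prefix, net_iface, network_id, time_s, dad_counter, secret_key,
        domain=b"temp-iid"):
    """RID = SHA-256 over length-prefixed inputs, with secret_key as the last field."""
    data = encode_fields(domain, prefix, net_iface, network_id,
                         struct.pack("!Q", time_s),       # fixed-width time
                         struct.pack("!B", dad_counter),  # fixed-width counter
                         secret_key)
    return hashlib.sha256(data).digest()

# Naive concatenation cannot tell these two inputs apart; the encoding can.
print(b"ab" + b"c" == b"a" + b"bc")                              # True
print(encode_fields(b"ab", b"c") == encode_fields(b"a", b"bc"))  # False
```

The `domain` label is only one way to avoid cross-construction collisions; using distinct secret keys for the two constructions, as suggested above, achieves the same separation.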
Glad to see this update.
I support Ben Kaduk’s DISCUSS position around the need for clarity on what “statistically different” means (per Section 3). I also strongly concur with Ben’s guidance on the construction of the RID in Section 3.3.1 ** Section 3.*. The guidance uses the language of “deprecate”. This to me would suggest some notion of state being kept where I know I’ve used an address before and I won’t use it again. However, that doesn’t seem right here. ** Section 3.1. Per “it must be difficult for an outside entity …”, what’s the rough thinking on the workload for “difficult”, or is there more precise language? I ask because Section 2.1 ** Section 3.3. The subsections present two different algorithms. Is there an MTI approach? ** Section 3.3.2. This section suggests the use of SHA-256, but is there normative guidance on an MTI algorithm? ** Section 3.3.2. If the hash algorithm output length exceeds the needed identifier length, how should truncation be handled? ** Editorial -- Section 1.2. Nit. For consistency, s/on-link/on path/
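On the truncation question for Section 3.3.2, one possible convention, shown here purely as an assumption and not as what the draft specifies, is to read the digest as a big-endian integer and keep its low-order 64 bits. The helper name `iid_from_rid` and the sample input are hypothetical.

```python
import hashlib

def iid_from_rid(rid: bytes, bits: int = 64) -> int:
    """Keep the `bits` least significant bits of rid, read as a big-endian integer."""
    return int.from_bytes(rid, "big") & ((1 << bits) - 1)

digest = hashlib.sha256(b"example input").digest()   # 256-bit output
iid = iid_from_rid(digest)
print(f"{iid:016x}")                                  # 16 hex digits, i.e. 64 bits
```

Whichever end of the digest is kept, the text would need to state both the bit count and the byte order so that independent implementations agree.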
I support Benjamin's DISCUSS. The shepherd writeup says "The original authors from RFC4941 have not been reachable", but those were S. Krishnan, R. Draves, and T. Narten, who are three of the authors of this document. I'm confused.
I support Ben's DISCUSS.
§3.3 (Generation of Randomized Interface Identifiers) starts by saying that the "subsections specify example algorithms". Are these algorithms just examples? Digging through the archive, it looks like they are: "It's one possible algorithm, and not necessarily the recommended one." However, §3.4 (Generating Temporary Addresses) says this: 6. New temporary addresses MUST be created by appending a randomized interface identifier (generated as described in Section 3.3 of this document) to the prefix that was received. IOW, it makes the algorithms in §3.3 mandatory ("MUST...as described in Section 3.3"). Please clarify the text one way or the other: by eliminating "examples" from §3.3, or by removing the text in parentheses in §3.4. https://mailarchive.ietf.org/arch/msg/ipv6/OVdexfOXgdDM4r1QZ3iQcA_oWVQ
Thank you for the work put into this document. It is an important topic and it has stimulated a lot of hot discussions on the 6MAN list ;-) I also appreciate the new section 3.7 where an admin can disable the mechanism. Is there a reason why this document does not briefly compare its addresses with RFC 7217 (stable addresses)? It could be helpful for the reader. Please find below a couple of non-blocking COMMENT points and nits. I hope that this helps to improve the document, Regards, -éric == COMMENTS == Should link-local address generation also be considered here? While the text is clear about 'global', there is no clear indication that this document does not apply to link-local addresses. -- Section 1 -- First sentence, should "or by static configuration" be added? Should a reference to DHCPv6 be added? -- Section 1.2 -- Should IPFIX collectors also be mentioned in the first bullet list? An on-path attacker (really close to the node though!) can also do the correlation based on the layer-2 address. Should this be added to the 2nd list? -- Section 2.2 -- I find the focus on DNS as 'rendez-vous' a little limiting. Why not mention DNS-SD or a SIP proxy or ...? Perhaps prefix the explanation with "When DNS is used" rather than using "machine would need a DNS name" (and BTW, it is also limiting to refer to a 'machine', since per-container IPv6 addresses could be used). -- Section 3.1 -- Point 5) the 2nd and 3rd sentences are really repeating themselves and not bringing a lot of value. Let's keep only one of the 2, or even none, as the last sentence is really clear. -- Section 3.4 -- Should the text be clear on whether optimistic DAD may be used? -- Section 3.6 -- Last paragraph "when an interface connects to a new (different) link, a new set of temporary addresses MUST be generated immediately" seems to imply more than 1 temporary address with the use of 'set' and the plural form. Unsure whether it is the right behavior (esp. if no RA-PIOs have been received yet).
-- Section 6 -- If only this were not 'future work'; but I agree, it is not up to this document to specify such a kernel API/implementation. -- Section 9 -- Should ingress filtering also be part of Section 4 (implications)? -- Section 11.2 -- I am afraid that RFC 7721 should be normative (introducing a downref though) as it is used in Section 1.1 (terminology).
I understand that this is an update of an older document that resolves several important issues. However, what was advanced traffic analysis 10 years ago is not as advanced today. The security considerations discuss some of the weaknesses. To me it appears that there are significant risks of correlating an old temporary address that has passed its preferred lifetime with the new preferred temporary address, especially if an attacker can trigger an endpoint to reconnect to a site where the previous temporary address was used, and can thus correlate the forced reconnection attempt with the observed use of a new temporary address toward the same destination. It might even be another destination, as long as it is associated with the same remote site. I have not put this at Discuss level, but my impression is that, although the mechanism is beneficial, the strength of its protection might be overstated in the various statements.