Summary: Has a DISCUSS. Has enough positions to pass once DISCUSS positions are resolved.
In Section 2.3 we make a claim about item 'e)' of section 5.5.3 of RFC 4862, in particular that it says that 'an RA may never reduce the "RemainingLifetime" to less than two hours', but the relevant text from RFC 4862 seems to be:

   2. If RemainingLifetime is less than or equal to 2 hours, ignore
      the Prefix Information option with regards to the valid
      lifetime, unless the Router Advertisement from which this
      option was obtained has been authenticated (e.g., via Secure
      Neighbor Discovery [RFC3971]). If the Router Advertisement
      was authenticated, the valid lifetime of the corresponding
      address should be set to the Valid Lifetime in the received
      option.

which clearly allows an *authenticated* RA to reduce the "RemainingLifetime" to smaller values. (Text with a similar not-quite-accurate statement appears in Section 2.4 of this document as well.)
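To make the point concrete, here is a minimal sketch of the valid-lifetime handling in item e) of Section 5.5.3 of RFC 4862. Only rule 2 is quoted above; rules 1 and 3 are paraphrased from the RFC, and the function and parameter names are hypothetical, chosen for illustration only.

```python
TWO_HOURS = 2 * 60 * 60  # seconds


def updated_valid_lifetime(advertised: int, remaining: int,
                           authenticated: bool = False) -> int:
    """Return the new valid lifetime (seconds) of an autoconfigured address.

    advertised    -- Valid Lifetime carried in the received PIO
    remaining     -- RemainingLifetime of the existing address
    authenticated -- True if the RA was authenticated (e.g., via SEND)
    """
    # Rule 1: a lifetime above 2 hours, or one that extends the current
    # remaining lifetime, is accepted as advertised.
    if advertised > TWO_HOURS or advertised > remaining:
        return advertised
    # Rule 2: with 2 hours or less remaining, the PIO is ignored for the
    # valid lifetime unless the RA was authenticated.
    if remaining <= TWO_HOURS:
        return advertised if authenticated else remaining
    # Rule 3: otherwise the valid lifetime is reset to 2 hours.
    return TWO_HOURS
```

With these assumptions, an unauthenticated RA carrying a Valid Lifetime of 0 leaves an address that has one hour remaining untouched, whereas an authenticated RA sets the valid lifetime of the same address to 0.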
I look forward to seeing the more comprehensive solution/advice in draft-ietf-6man-slaac-renum and draft-ietf-6man-cpe-slaac-renum progress.

I note that https://www.rfc-editor.org/materials/abbrev.expansion.txt does not mark "CPE" as "well-known", implying that we should probably expand it on first use.

Section 1

   Scenarios where this problem may arise include, but are not limited
   to, the following:

   o  The most common IPv6 deployment scenario for residential or small
      office networks is that in which a CPE router employs DHCPv6
      Prefix Delegation (DHCPv6-PD) [RFC8415] to request a prefix from
      an Internet Service Provider (ISP), and a sub-prefix of the leased

(nit/editorial) this construction looks like "scenarios where Q might happen include: A common scenario for X is Y. Sometimes, Y can be a scenario where Q happens." Pedantically, the list contents don't match the list introduction, since the extra introductory material doesn't match the classifier attempting to describe it.

   o  A router (e.g. Customer Edge router) may advertise
      autoconfiguration prefixes corresponding to prefixes learned via
      DHCPv6-PD with constant PIO lifetimes that are not synchronized
      with the DHCPv6-PD lease time (as required in Section 6.3 of
      [RFC8415]). [...]

nit: I suggest the parenthetical be "even though Section 6.3 of [RFC8415] requires such synchronization", since the present formulation is potentially unclear about which behavior is the required one.

   This means that in the aforementioned scenarios, the stale addresses
   would be retained and also actively employed for new communications
   instances for an unacceptably long period of time (one month, and
   one week, respectively), leading to interoperability problems,
   instead of hosts transitioning to the newly-advertised prefix(es) in
   a timelier manner.

I am not 100% sure that "would" is always applicable, as I could imagine situations that conform to the above-listed scenarios yet have qualifying factors that result in non-use of the stale addresses. Perhaps "could" is more appropriate?

Section 2.2

   IPv6 SLAAC employs the following default PIO lifetime values:

   o  Preferred Lifetime (AdvPreferredLifetime): 604800 seconds (7 days)

   o  Valid Lifetime (AdvValidLifetime): 2592000 seconds (30 days)

We noted these values previously, in the Introduction. Is it useful to repeat them in both locations?

   Under problematic circumstances, such as where the corresponding
   network information has become stale without any explicit signal
   from the network (as described in Section 1), it will take a host 7
   days (one week) to deprecate the corresponding addresses, and 30
   days (one

I suggest "up to" [7 days ...].

Section 2.5

   At times, the prefix lease time is fed as a constant value to the
   SLAAC router implementation, meaning that, eventually, the prefix
   lifetime advertised on the LAN side will span *past* the DHCPv6-PD
   lease time. This is clearly incorrect, since the SLAAC router
   implementation would be allowing the use of such prefixes for a
   longer time than it has been granted usage of those prefixes via
   DHCPv6-PD.

I recognize that this is an informational document and we're not obligated to give advice, but we've given advice elsewhere in the document and it feels weird to end the section on such a grim note. Should we say something about such implementations ideally getting updated to reflect the specification?

Section 3.2

   NOTES:
      A CPE router advertising a sub-prefix of a prefixed leased via
      DHCPv6-PD will periodically refresh the Preferred Lifetime and
      the

nit: s/prefixed leased/prefix leased/

Section 6

   This document does not introduce any new security issues.

(side note) we sort-of recommend using different values for AdvPreferredLifetime/AdvValidLifetime, which would presumably affect the tradeoffs for robustness vs. susceptibility to attack. But the values from RFC 4861 are just "defaults", so there's a reasonable claim to be made that the relevant security considerations should have been covered in 4861 itself and we don't need to say more here.

Section 8.1

It's not clear to me that the one place we cite RFC 4941 qualifies as a normative reference.

Section 8.2

The use of "[Linux]" as a slug for referencing a post to netdev is perhaps debatable.

The way in which we cite RFC 6724 seems similar to the way in which we cite, e.g., RFC 8028 (which is listed as normative).
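As an illustration of the Section 2.5 comment above (advertised lifetimes spanning past the DHCPv6-PD lease), here is a minimal sketch of one way a router could keep its PIO lifetimes within the lease. The function name and its parameters are hypothetical; the defaults are the AdvPreferredLifetime and AdvValidLifetime values quoted from Section 2.2. This is not taken from the draft or from any particular implementation.

```python
def pio_lifetimes(pd_preferred_remaining: int, pd_valid_remaining: int,
                  adv_preferred: int = 604800,
                  adv_valid: int = 2592000) -> tuple[int, int]:
    """Return (preferred, valid) PIO lifetimes in seconds.

    Each advertised lifetime is capped by the time remaining on the
    corresponding DHCPv6-PD lease lifetime, so the LAN-side
    advertisement never outlives the delegation.
    """
    valid = min(adv_valid, pd_valid_remaining)
    # The preferred lifetime must not exceed the valid lifetime.
    preferred = min(adv_preferred, pd_preferred_remaining, valid)
    return preferred, valid
```

Under these assumptions, a router with one hour of preferred and two hours of valid lease time remaining would advertise 3600 and 7200 seconds instead of the 7-day and 30-day defaults.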
It seems like the first paragraph of Section 4 should be removed since it isn't future work at this point.
Thanks to Klaas Wierenga for the SECDIR review.

** Section 6. As Section 3.2 is proposing tuning the parameters in RFC4861, it is likely worth reiterating that these security considerations still apply.

** Editorial

-- Section 1. Editorial. s/and and/and/

-- Section 2.2. Most of this text was already stated in Section 1.
I would like a paragraph somewhere about what happens today in the network without these mitigations. Presumably in most cases the outage doesn't persist for 30 days, or whatever? Do people just reboot endpoints? Is there a service call that results in manual IPv6 address reconfiguration?
[[ comments ]]

[ section 2.1 ]

* There should be some clarification that use of dynamic prefixes does not automatically imply flash renumbering, but rather that it increases the likelihood of a flash renumbering event occurring (basically make it clear that flash renumbering is the issue, not dynamically changing prefixes).

* There's also more than one layer of PD stability to be considered: the stability of the block delegated from the ISP to the modem/ISP-provided CPE (discussed here), and (for example) the stability of the prefix that's subdelegated to another router in the home (in cases where the user has purchased an additional router to place between nodes in the home and the ISP CPE).

  In this way, even with stable ISP->CPE prefix delegation, it might be possible for the home router to get a different subprefix on reboot.

[ section 2.2 ]

* This section should make it clear that the magnitude of the impact is a function of these timers and that these defaults are not necessarily in common use.

  The text strongly implies that all flash renumbering events impact hosts for 7 days, and I don't think that's true. (I don't think I've been on any dynamic prefix network that used these defaults for a long time.)

[ section 2.3 ]

* I support Ben's observation about authenticated RAs.

[[ nits ]]

[ abstract ]

* "will continue using stale prefixes" -> "may continue using stale prefixes" or "could" or "might" or "are likely to"

  I think "will" is only correct under very certain circumstances.

  Same text change in the first paragraph of the introduction as well.

[ section 1 ]

* "and and" -> "and"

* "configure for": perhaps "configured from" the previously-advertised prefix
I concur with Martin Duke's suggestion. Otherwise, I've just a couple of nits here:

Abstract:

* "This document documents this issue ..." -- how about "describes"?

Section 1:

* s/timelier/more timely/
Thank you for the work put into this document. It is easy to read but errs sometimes on the anecdotes side rather than on the facts side (except for Jordi's survey).

As discussed before, I personally wonder whether it is a real problem for the IETF: it is largely about CPE/node implementation issues and not a protocol one (even if I agree that the RFC 4861 default timers were badly chosen 20 years ago).

Please find below a couple of non-blocking COMMENT points and one nit but please also check:

- Ted Lemon's IoT directorate review with his note about sleeping devices and time-outs: https://datatracker.ietf.org/doc/review-ietf-v6ops-slaac-renum-04-iotdir-telechat-lemon-2020-10-19/

- Sheng Jiang's Internet directorate review: https://datatracker.ietf.org/doc/review-ietf-v6ops-slaac-renum-04-intdir-telechat-jiang-2020-10-19/

I hope that this helps to improve the document,

Regards,

-éric

== COMMENTS ==

-- Section 1 --

"for an unacceptably long period of time, thus resulting in connectivity problems." While the "long period of time" is explained at the end of this section, giving a hint here would help the reader to appreciate the problem. I found this introduction more qualitative than quantitative.

"it is not an unusual behavior": maybe... but documenting it (OS versions, specific use cases) would make this argument stronger.

"there has been evidence that some 802.1x supplicants do not reset network settings after successful 802.1x authentication." This is a very outdated behavior of Windows, if I am not mistaken, and was fixed years ago. In all cases, documenting it (OS version, specific case) would make this argument stronger.

"Lacking any explicit signaling to deprecate the previously-advertised prefixes": as the explicit mechanism exists, I suggest s/explicit/reliable/

"because of egress-filtering by the CPE or ISP": or is it ingress-filtering when packets are sourced by an 'internal' host?

"or routed elsewhere": I wonder how a packet could be routed elsewhere if the source address is wrong. Policy-based routing? I suggest removing those words.

-- Section 2.1 --

Jordi's survey (a good one) does not say how often those prefix changes happen or whether they are planned. My own /48 is not 'stable per contract' but has been stable for 7 years (as long as the BNG does not change it will stay the same, as AAA & DHCP are linked together at my ISP). So, the 37% probably does not mean that 37% of the CPEs are changing prefix every day.

Did the authors check with the German ISP? AFAIK, the default policy has changed.

I suggest changing the reference from RFC 4941 to the -bis document that is in the same IESG telechat (so it will not cause delay in the publication of this document).

-- Section 2.3 --

No reply required but I find the last part of this section quite smart. I would not have thought about this corner case ;-)

-- Section 2.5 --

"Not unusually, the two protocols are implemented in two different": I am afraid that this is again 'anecdote' and not 'facts'. Citing implementation details would make this statement stronger.

-- Section 3 --

To be honest, I was about to ballot a DISCUSS on this section as I think that there could be other mitigation techniques. E.g., the ISP could advertise 2 prefixes by DHCP-PD for a while (I would need to re-read DHCPv6 to be sure), allowing an easier roll-over of the prefixes (esp. for a planned prefix change). Or possibly remove this section completely, as 3.1 is obvious and the section is not exhaustive.

-- Section 3.2 --

While I agree that the default timers are wrong and that the suggested values are way better, this is a change in the CPE doing SLAAC and not in operator settings (or did I misunderstand 'operator' as 'network operator' in the sense of ISP, as opposed to residential user?). Is the same technique also described in the SLAAC CPE document?

== NITS ==

-- Section 3.2 --

The use of () in the first paragraph renders it difficult to parse. Consider rewriting it.
Hi,

Thank you for this document that highlights an operational issue.

My same comments regarding the acknowledgements and references as for draft-ietf-v6ops-cpe-slaac-renum-05 also apply here.

Thank you to Juergen for the Opsdir review. I also broadly agree with his comments.

Although tweaking the SLAAC timers helps reduce this problem somewhat, it doesn't seem to mitigate it altogether. Ideally, there would be a way for the SLAAC protocol to indicate that the advertised prefixes replace all prefixes that had previously been advertised by that device. Hopefully draft-ietf-v6ops-slaac-renum will specify suitable mitigation.

I also agree with Juergen's statement regarding trying to make hosts more robust if they detect connectivity failures, particularly if there are multiple prefixes available that they could choose from. I don't know if this might be worth mentioning in section 4 on Future Work?

Regards,
Rob