Summary: Has enough positions to pass.
Thank you for this document. Among other helpful things, it includes the most clear and memorable description of the distinction between PMTUD and PLPMTUD that I've heard yet, which I expect I will even be able to remember for the next time I need it.

Section 2.1

   Each link is constrained by the number of bytes that it can convey in a single IP packet. This constraint is called the link Maximum Transmission Unit (MTU). Whlie the end-to-end Path MTU is the size of a single IPv4 header, IPv4 [RFC0791] requires every link to support at least a specified MTU (see NOTE 1). IPv6 [RFC8200]

I don't understand what "the end-to-end Path MTU is the size of a single IPv4 header" means.

nit: s/Whlie/While/

Section 3.2

It might be worth another sentence to codify that sending the fragments of a message to different next hops will end poorly.

Section 3.8.1

   The effect of rate limiting may be severe, as RFC 4443 recommends strict rate limiting of IPv6 traffic.

nit: s/IP/ICMP/

Section 4.1

nit: s/User Data Protocol/User Datagram Protocol/

Section 4.2

nit: s/is sufficiently small is sufficiently small/is sufficiently small/
It's very seldom that I ballot Yes on a document for which I'm not the Responsible AD, but this is important enough that I'm doing so. Unfortunately, there are some bits which make me uncomfortable, and so I spent a while in the unusual situation of trying to decide between DISCUSS and YES. After looking at the author list and responsible AD I'm sure that my comments will be considered, and so I'm balloting Yes.

1: "Legacy protocols that depend upon IP fragmentation SHOULD be updated to remove that dependency."
I really don't like the SHOULD here -- while I fully agree that legacy protocols should be updated, the RFC2119 usage feels weird - it's unclear exactly who it is aimed at (everyone? the people who wrote the legacy protocols? some mythical cleanup author?)

2: I'm unclear why IP-in-IP tunnels are called out at the top / in the Introduction. There is a whole section (Packet-in-Packet Encapsulations) where I think it would go better -- I see no harm in having people have to read down to there to note this.

3: "NOTE 2: A non-fragmentable packet can be fragmented at its source. However, it cannot be fragmented by a downstream node. An IPv4 packet whose DF-bit is set to 0 is fragmentable. An IPv4 packet whose DF-bit is set to 1 is non-fragmentable. All IPv6 packets are also non-fragmentable."
I have a few issues with this:

3.1: I'm not really sure a non-fragmentable packet can be fragmented at its source -- the packet *can* be fragmented but I'd say that that is before it has become non-fragmentable. It's entirely possible that I'm missing something obvious here, but a skim of RFC 791 didn't show me what....

3.2: This may be a corner case, but some tunneling gear seems happy to ignore the DF bit when it is doing reassembly on the far side of the tunnel -- the logic seems to be that as long as what goes into the tunnel matches what comes out it doesn't matter what actually happens *inside* the tunnel.

3.3: Related to this - most tunneling gear (and many firewalls) allow you to clear the DF bit in packets -- for example, Cisco has the 'crypto ipsec df-bit clear' command, Junos allows you to do 'services ipsec-vpn rule myvpn term stomp_on_df then clear-dont-fragment-bit;', iptables lets you 'iptables -t mangle -A POSTROUTING -j DF --clear'

3.2 and 3.3 are common enough that I think that they deserve mention.

4: What do you mean by middle box in section 6.3? Yes, I know what is commonly called a middlebox, but I don't know of a good reference -- as an example, I've got some honkin' big routers which do stateless firewall filtering -- these are covered by 3.7, but are they also middleboxes, and if not, why not?! (Note, my personal belief is that they aren't, but I cannot really point to why, other than "I know a middlebox when I see one". There is a discussion on some of this in "Why Operators Filter Fragments and What It Implies" (draft-taylor-v6ops-fragdrop-02), but this also uses the term middlebox without defining it.)

Nits:
A: Whlie -- While
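To make the effect of the "clear DF" knobs in 3.3 concrete, here is a minimal Python sketch of what such a device does to an IPv4 header: it clears the DF bit in the flags/fragment-offset field and recomputes the header checksum. The helper names (build_header, clear_df, df_is_set), the addresses, and the ID value are illustrative assumptions, not taken from the draft or from any vendor implementation, and the sketch only handles a 20-byte header without options.

```python
import struct

DF_FLAG = 0x4000  # "Don't Fragment" bit in the 16-bit flags/fragment-offset field (RFC 791)


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over 16-bit words.

    With the checksum field zeroed this yields the value to store;
    over a header that already carries a valid checksum it yields 0.
    """
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(df: bool) -> bytes:
    """Build a 20-byte IPv4 header (no options); all field values are arbitrary examples."""
    flags_frag = DF_FLAG if df else 0
    hdr = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0x1234, flags_frag, 64, 17, 0,
        bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]),
    )
    return hdr[:10] + struct.pack("!H", ipv4_checksum(hdr)) + hdr[12:]


def df_is_set(header: bytes) -> bool:
    (flags_frag,) = struct.unpack("!H", header[6:8])
    return bool(flags_frag & DF_FLAG)


def clear_df(header: bytes) -> bytes:
    """Return a copy of the header with DF cleared and the checksum recomputed."""
    (flags_frag,) = struct.unpack("!H", header[6:8])
    flags_frag &= ~DF_FLAG & 0xFFFF
    hdr = header[:6] + struct.pack("!H", flags_frag) + header[8:10] + b"\x00\x00" + header[12:]
    return hdr[:10] + struct.pack("!H", ipv4_checksum(hdr)) + hdr[12:]
```

After clear_df, a downstream router is free to fragment a packet that the source had marked non-fragmentable, which is why this behavior seems relevant to NOTE 2.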
Thanks for writing this document! I think this will be a very useful reference also for us ADs! Also thanks for addressing all the comments of the TSVART early review (and thanks to Gorry for this really good and detailed review!).

I'm not certain if the following comment from that review was finally resolved. Can you maybe double-check with Gorry?

From Gorry's mail on May 29:

">> Section 4.6
>> You may find it useful to check and refer to the following IPv4 specs that relate to the use of fragments:
>> - Please note the recommendation to check details and cite RFC6864 on IPv4 Fragment ID.
>> - Please note the recommendation to check and cite RFC 4787 on NAT handling of IP fragments.
> I'm hard pressed to say what changes you would suggest, or what section you want to see a reference in. To be very honest, a two minute timeout makes a lot of sense in an Internet in which a packet requires a significant portion of a second to transmit; in a megabit-to-gigabit Internet, the corresponding interval is probably measured in seconds or, at most, tens of seconds. In any event, I would want to see data supporting the recommendation.
>
> Please be more specific.

RFC 6864 is in the references - which I think is good. Perhaps a way to resolve my comment would be for section 2.1 to mention IPID and then cite RFC6864. I'd be happy to see more text on this topic, but I suspect it is sufficient to simply ensure the reader is aware that these RFCs impact fragmentation.

I was imagining something like this:

The set of IP fragments comprising a complete IPv4 datagram all carry the same non-zero value in the IPID field. RFC 6864 describes issues relating to the handling of the IPv4 ID field in middleboxes. Sections 10 and 11 of RFC 4787 describe best current practice for handling fragmentation in network devices performing Network Address Translation (NAT)."

And one more minor comment:

Sec 6.1: "In these cases, the protocol will continue to rely on IP fragmentation but should only be used in environments where IP fragmentation is known to be supported."
Should this "should" be a "SHOULD"?
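As an illustration of why the IPID text proposed in Gorry's review matters, here is a rough Python sketch of reassembly keyed on (source, destination, protocol, IPID). The dictionary layout of a fragment and the function name reassemble are assumptions made for this example only; a real stack also needs timeouts and overlap handling, which are omitted here.

```python
from collections import defaultdict


def reassemble(fragments):
    """Group fragments by (src, dst, proto, ip_id) and rebuild complete datagrams.

    Each fragment is a dict with hypothetical keys:
    src, dst, proto, ip_id, offset (in bytes), more (bool), data (bytes).
    Returns {key: payload} for every datagram whose fragments are all present.
    """
    buckets = defaultdict(list)
    for frag in fragments:
        key = (frag["src"], frag["dst"], frag["proto"], frag["ip_id"])
        buckets[key].append(frag)

    complete = {}
    for key, frags in buckets.items():
        frags.sort(key=lambda f: f["offset"])
        expected = 0
        parts = []
        finished = False
        for frag in frags:
            if frag["offset"] != expected:
                break  # gap or overlap: give up on this datagram
            parts.append(frag["data"])
            expected += len(frag["data"])
            if not frag["more"]:
                finished = True  # last fragment seen with no gaps before it
                break
        if finished:
            complete[key] = b"".join(parts)
    return complete
```

Because the IPID is part of the key, a middlebox that rewrites or reuses it inconsistently can cause fragments of different datagrams to be merged, or fragments of one datagram never to meet.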
I also support Alissa’s DISCUSS.
Thanks for addressing my DISCUSS.
** I support Alissa Cooper's discuss item

** Section 3.7. Per the discussion about NIDS, evasion using fragments also arose when stateless pattern matching occurred.

** Section 3.7. Related to NIDS, naïve flow-based anomaly detection systems/analytics have also been known to introduce false positives, if IP packet counts are confused with IP fragment counts.

** Editorial

-- Section 1. Per “but the designer should to be aware that fragmented packets may result in blackholes”, the reference to “blackholes” seems imprecise.

-- Section 2.1. Typo. s/Whlie/While/

-- Section 3.8.2. Recommend adding a sentence at the end of the first paragraph to suggest this is just an example. I’ve seen even worse default ICMP policies in consumer routers.

-- Section 3.8.2. Typo. s/a incorrect/an incorrect/

-- Section 5.1. Typo. s/signalling/signaling/
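To illustrate the second Section 3.7 point, here is a small Python sketch of how a naive counter can diverge from the number of datagrams actually sent. The Pkt record and both function names are hypothetical, and the sketch assumes the monitor sees every initial fragment; it counts each packet with fragment offset 0 as the start of one datagram.

```python
from collections import namedtuple

# Hypothetical per-packet record as a flow monitor might see it.
# frag_offset is in 8-byte units, as in the IPv4 header.
Pkt = namedtuple("Pkt", "src dst proto ip_id frag_offset more_fragments")


def count_on_wire_and_datagrams(pkts):
    """Return (packets seen on the wire, datagrams the sender emitted)."""
    on_wire = len(pkts)
    # An unfragmented packet or an initial fragment has offset 0;
    # non-initial fragments do not start a new datagram.
    datagrams = sum(1 for p in pkts if p.frag_offset == 0)
    return on_wire, datagrams


def example_trace():
    """One datagram split into three fragments, plus two unfragmented packets."""
    a, b = "192.0.2.1", "198.51.100.1"
    return [
        Pkt(a, b, 17, 100, 0, True),
        Pkt(a, b, 17, 100, 185, True),
        Pkt(a, b, 17, 100, 370, False),
        Pkt(a, b, 17, 101, 0, False),
        Pkt(a, b, 17, 102, 0, False),
    ]
```

On the example trace the wire count is 5 while the sender emitted 3 datagrams, so a threshold tuned on datagram rates would fire early if fed fragment counts.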
I enjoyed reading this document.
I support Alissa's DISCUSS. [nit] s/OSPFv3 [RFC2328][RFC5340]/OSPFv3 [RFC5340]/
Thank you for the work put into this document, a real problem in the real Internet.

I have a couple of COMMENTs below in the hope to improve the quality of the document.

Regards,

-éric

== COMMENTS ==

-- Section 2.1 --
"An Internet path connects a source node to a destination node", what about multicast? If the document is for unicast, then it should be stated early in the document.

s/If a link fails,/If a link or a router fail,/

A reference to "see NOTE 1" (using xref/target in XML) would be easier for the reader. Same for "NOTE 2" of course.

-- Section 2.2 --
"In IPv4, the upper-layer header usually appears in the first fragment": perhaps the right place to mention RFC 3128 (informational)?

-- Section 2.3 --
Perhaps explicitly mention that PLPMTUD is therefore more reliable than ICMP-based PMTUD but not applicable to all traffic?

-- Section 3.1 --
Please add a reference to A+P (or better, MAP?) and CGN (done only in section 3.3). More generally, it is not clear for the reader why virtual reassembly increases the fragility.

-- Section 3.3 --
"NOTE 1": is it the same as in section 2.1? Also add xref/target for the reader's convenience.

-- Section 3.4 --
Suggest to replace "trailing fragments" by "non-initial fragments" as it is more accurate.

-- Section 3.6 --
The section title refers to "data rate" while it is rather "fragment generation rate" (which will of course increase the data rate as a consequence). BTW, I never thought about that ;-)

-- Section 3.7 --
Is it still the case that "Many implementations set the Identification field to a predictable value"? Would it be possible to have some data backing this statement?

-- Section 3.8 --
AFAIK, RFC 4890 is only applicable to IPv6 ICMP and not to IPv4 ICMP messages... please rewrite this part.

-- Section 3.8.1 --
s/ICMP rate limiting/ICMP generation rate limiting/

Also RFC 4443 is IPv6 specific. Add some text for IPv4?

-- Section 3.8.2 --
Unsure what the "zone-based" has to do with the content of this section.

-- Section 3.8.3 --
It may be worth adding that the paths TO and FROM the anycast DNS server can be different, hence causing the described problem.

-- Section 3.8.4 --
While actually correct, this would also mean that BCP38 is not implemented in this network.

-- Section 6.3 --
"that behavior MUST be clearly documented": unsure whether a "MUST" can be used for a non-protocol action.

-- Section 6.5 --
"MAY rate limit ICMP messages": please state "MAY rate limit the generation of ICMP messages".

Unsure how "Network operators SHOULD NOT filter IP fragments " can be implemented as non-initial fragments have no UDP/TCP ports... Doing stateful filtering is probably undoable. Even with a SHOULD this is probably too restrictive.

== NITS ==

-- Section 2.1 --
s/Whlie/While/
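To show the Section 6.5 concern about ports in concrete terms, here is a short Python sketch that fragments a UDP datagram and then tries to read the ports from each fragment, as a stateless filter would. The functions fragment and udp_ports, the port numbers, and the 1480-byte fragment payload are illustrative assumptions.

```python
import struct


def fragment(payload: bytes, max_payload: int):
    """Split an IP payload into (offset_in_8_byte_units, more_fragments, data) tuples."""
    step = (max_payload // 8) * 8  # all fragments but the last carry a multiple of 8 bytes
    frags = []
    for start in range(0, len(payload), step):
        more = start + step < len(payload)
        frags.append((start // 8, more, payload[start:start + step]))
    return frags


def udp_ports(frag_offset: int, data: bytes):
    """Return (source_port, destination_port), or None if this fragment carries no UDP header."""
    if frag_offset != 0 or len(data) < 4:
        return None  # non-initial fragment: the bytes here are application data, not ports
    return struct.unpack("!HH", data[:4])


def example():
    """An 8-byte UDP header plus 3000 bytes of data, split for a 1500-byte MTU."""
    udp = struct.pack("!HHHH", 5353, 53, 8 + 3000, 0) + bytes(3000)
    return [udp_ports(offset, data) for offset, _more, data in fragment(udp, 1480)]
```

Only the first of the three fragments yields ports. A port-based rule therefore has to either pass or drop the remaining two blindly, unless the filter keeps per-datagram state.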