IPv4 Reassembly Errors at High Data Rates
draft-heffner-frag-harmful-05
Document history
Date | Rev. | By | Action |
---|---|---|---|
2012-08-22
|
05 | (System) | post-migration administrative database adjustment to the No Objection position for Cullen Jennings |
2012-08-22
|
05 | (System) | post-migration administrative database adjustment to the Yes position for David Kessens |
2007-05-06
|
05 | (System) | IANA Action state changed to No IC from In Progress |
2007-05-06
|
05 | (System) | IANA Action state changed to In Progress |
2007-05-03
|
05 | Amy Vezza | State Changes to RFC Ed Queue from Approved-announcement sent by Amy Vezza |
2007-05-02
|
05 | Amy Vezza | IESG state changed to Approved-announcement sent |
2007-05-02
|
05 | Amy Vezza | IESG has approved the document |
2007-05-02
|
05 | Amy Vezza | Closed "Approve" ballot |
2007-05-01
|
05 | Lars Eggert | State Changes to Approved-announcement to be sent from IESG Evaluation::AD Followup by Lars Eggert |
2007-05-01
|
05 | Cullen Jennings | [Ballot Position Update] Position for Cullen Jennings has been changed to No Objection from Discuss by Cullen Jennings |
2007-05-01
|
05 | (System) | Sub state has been changed to AD Follow up from New Id Needed |
2007-05-01
|
05 | (System) | New version available: draft-heffner-frag-harmful-05.txt |
2007-04-30
|
05 | Lars Eggert | Waiting for -05 to show up in the repository. |
2007-04-28
|
05 | Cullen Jennings | [Ballot discuss] 05 version resolves all issues |
2007-04-03
|
05 | Lars Eggert | State Changes to IESG Evaluation::Revised ID Needed from IESG Evaluation::AD Followup by Lars Eggert |
2007-04-03
|
05 | Lars Eggert | The ongoing discussion seems to suggest that a revision will be needed. |
2007-02-13
|
05 | Cullen Jennings | [Ballot discuss] I asked Eric Rescorla to review this and he sent the following which does get at exactly my concerns. The -04 revision of this document doesn't really address most of my issues. In a number of cases the authors provided some comments in e-mail but they don't seem to have made it into the document. 1. - This was dealt with. 2. If we assume an initial packet size of 10K (10^4 bytes) then the 10 TB (10^13 bytes) represents 10^9 packets. Stone et al. report error rates of between 10^-3 and 10^-5. Accordingly, even ignoring fragmentation issues we would expect to see between 10^4 and 10^6 errored packets. The authors report 10^7 errored packets. OTOH, if we assume a more plausible 1500 byte MTU and that the intervening links are smaller, then the number of errored packets due to fragmentation is commensurate with the number of pure data errors. I would think the bottom line here is that you can't trust the TCP/UDP checksum if you can't accept a base packet error rate of 10^-5 or so packets. This was to some extent addressed in e-mail but I don't see anything in the draft. I was hoping the authors would add some discussion of Stone's results, but that doesn't seem to have happened. 3. The effective data rate here is quite high (100 Mb/s) for the Internet environment. What would the situation look like at a more practical real-world data rate (e.g., 10 Mb/s)? Certainly, we should confine this to WAN contexts since in a LAN you can directly observe your MTU and Kent et al. should have convinced people not to send data which has to be fragmented at their outgoing interface. Again, this was sort-of addressed in e-mail but I don't see anything in the draft addressing it. 4. Regarding the flow control/false congestion issue, how much worse is this setting going to be than one would expect from Kent et al.'s work? If you only have burst losses, wouldn't you expect that a TCP-friendly protocol would back off to the point where the fragment reassembly buffers start to time out and the cycles time out? Again, this was sort-of addressed in e-mail but I don't see anything in the draft addressing it. Finally, I'm trying to figure out what the take-home message of this document is. How would implementors behave differently after reading this document than they would have before? After exchanging e-mail with the authors I'm even more confused about this. Here's the relevant portion: Further, the problem definitely does exist on LANs. One of the earliest reported cases of this issue was on NFS servers using early 100 Mbps NICs. NFS transmits one block per datagram, and if you configure it with a normal block size of 2k or 4k, it fragments. (You end up getting better performance with the larger block size than with 1k blocks.) The high rate fragmentation led to occasional file corruption, in a situation where otherwise the bit error rate was very low. Part of our motivation in writing this is that it seemed Kent, et al. were not always fully persuasive in deterring the use of fragmentation. There have been a number of UDP bulk transport tools developed to work around some TCP issues. For example, see , . Some of these tools use fragmentation to reduce syscall CPU overhead, since it increases their overall performance, regardless of the Kent/Mogul issues. Isn't the take-home message here that fragmentation is sometimes good, provided that you use a strong checksum to detect and compensate for occasional corruption? |
2007-01-29
|
05 | David Kessens | [Ballot Position Update] Position for David Kessens has been changed to Yes from Discuss by David Kessens |
2007-01-26
|
05 | (System) | Sub state has been changed to AD Follow up from New Id Needed |
2007-01-26
|
04 | (System) | New version available: draft-heffner-frag-harmful-04.txt |
2006-12-15
|
05 | Amy Vezza | State Changes to IESG Evaluation::Revised ID Needed from IESG Evaluation by Amy Vezza |
2006-12-15
|
05 | (System) | Removed from agenda for telechat - 2006-12-14 |
2006-12-14
|
05 | Lisa Dusseault | [Ballot Position Update] New position, No Objection, has been recorded by Lisa Dusseault |
2006-12-14
|
05 | Ross Callon | [Ballot Position Update] New position, No Objection, has been recorded by Ross Callon |
2006-12-14
|
05 | Ross Callon | [Ballot comment] I agree with Dave Kessens that the title and abstract need to be changed to make it clear that this is really only talking about IP Fragmentation problems for extremely high bandwidth communication (which may indeed be quite important for supercomputer centers, but are not applicable for normal Internet users). However, I don't see any need to enter a "discuss" because Dave already has one for the same issue and I agree with Dave's proposed solution of changing the title and abstract to make it clear what the scope of this document actually is. |
2006-12-14
|
05 | Magnus Westerlund | [Ballot Position Update] New position, Yes, has been recorded by Magnus Westerlund |
2006-12-14
|
05 | Jari Arkko | [Ballot Position Update] New position, No Objection, has been recorded by Jari Arkko |
2006-12-14
|
05 | Dan Romascanu | [Ballot Position Update] New position, No Objection, has been recorded by Dan Romascanu |
2006-12-14
|
05 | Cullen Jennings | [Ballot discuss] I asked Eric Rescorla to review this and he sent the following which does get at exactly my concerns. $Id: draft-heffner-frag-harmful-03-rev.txt,v 1.2 2006/12/14 05:39:35 ekr Exp $ This document argues that as a result of the IP identification field being only 16 bits, it is possible to confuse fragments from multiple packets and as a consequence produce pathological results. This is potentially an issue in two cases: - Bulk data transfer via UDP - Where TCP implementations don't set DF (rare) or where gateways don't respect it. The pathological results come in two flavors: - Corrupt packets being caught by the TCP/UDP checksum and being interpreted as congestion - Corrupt data passing the too-short TCP/UDP checksum and being passed to the application. After reviewing the experimental data in Section 4 I'm not sure it supports the authors' conclusions. 1. It seems to be incompletely described. In particular, the authors don't describe the pre-fragmentation datagram size, which is important for interpreting the results. After all, if the datagram size is MTU then presumably they wouldn't get any fragmentation at all whereas if it's the same as the file size then presumably effective throughput would be near zero. This may be fixed in QUANTA but it needs to be described here. 2. If we assume an initial packet size of 10K (10^4 bytes) then the 10 TB (10^13 bytes) represents 10^9 packets. Stone et al. report error rates of between 10^-3 and 10^-5. Accordingly, even ignoring fragmentation issues we would expect to see between 10^4 and 10^6 errored packets. The authors report 10^7 errored packets. OTOH, if we assume a more plausible 1500 byte MTU and that the intervening links are smaller, then the number of errored packets due to fragmentation is commensurate with the number of pure data errors. I would think the bottom line here is that you can't trust the TCP/UDP checksum if you can't accept a base packet error rate of 10^-5 or so packets. 3. The effective data rate here is quite high (100 Mb/s) for the Internet environment. What would the situation look like at a more practical real-world data rate (e.g., 10 Mb/s)? Certainly, we should confine this to WAN contexts since in a LAN you can directly observe your MTU and Kent et al. should have convinced people not to send data which has to be fragmented at their outgoing interface. 4. Regarding the flow control/false congestion issue, how much worse is this setting going to be than one would expect from Kent et al.'s work? If you only have burst losses, wouldn't you expect that a TCP-friendly protocol would back off to the point where the fragment reassembly buffers start to time out and the cycles time out? Finally, I'm trying to figure out what the take-home message of this document is. How would implementors behave differently after reading this document than they would have before? |
2006-12-13
|
05 | David Kessens | [Ballot discuss] How real is this problem really for most Internet users? The document says: For example, a host sending 1500 byte packets with a 30 second maximum packet lifetime could send at only about 26 Mbits/s before exceeding 65535 packets per packet lifetime. Or, filling a 1 Gbit/s interface with 1500 byte packets requires sending 65536 packets in less than 1 second, an unreasonably short maximum packet lifetime, being less than the round-trip time on some paths. This requirement is widely ignored. Basically, how many people are able to blast 1 Gbit/s out of a single interface but have such bad connectivity that they regularly have round-trip times of 1 second or more? Or for the other example, a 30 second maximum lifetime seems rather high for most Internet communications that don't leave earth. This is not to say that fragmentation is not extremely harmful, for many other reasons that are not mentioned in this draft. Basically, this draft really only supports a title and abstract that says something like 'IPv4 fragmentation problems with 1500 byte packets and data rates higher than around 1 Gbit/s'. This problem is really easy to fix: just change the title, abstract and introduction a little bit so that they make it clear that the scope is quite limited (until most people get to enjoy 1 Gbit/s on every desktop). |
2006-12-13
|
05 | David Kessens | [Ballot Position Update] New position, Discuss, has been recorded by David Kessens |
2006-12-13
|
05 | Mark Townsley | [Ballot Position Update] New position, No Objection, has been recorded by Mark Townsley |
2006-12-13
|
05 | Ted Hardie | [Ballot Position Update] New position, No Objection, has been recorded by Ted Hardie |
2006-12-13
|
05 | Sam Hartman | [Ballot Position Update] New position, Yes, has been recorded by Sam Hartman |
2006-12-13
|
05 | Brian Carpenter | [Ballot comment] 5. Implications ... IPv6 is less vulnerable to this type of problem, since its fragment header contains a 32-bit identification field [RFC2460]. Mis-association will only be a problem at packet rates 65536 times higher than for IPv4. Should note that IPv6 fragmentation only occurs e2e and there is no DF bit; hence errors caused by non-respect of the DF bit cannot occur. From Gen-ART reviewer Robert Sparks: "Some comments from a personal preference point-of-view: Consider changing the title to something describing the results directly - make this more likely to find when someone in the future uses the rfc-index to find issues with reassembly. Tuning the abstract to reflect the results rather than the consequences of the results might also help draw eyes to the document, but I'm not sure how many people filter/choose documents based on the abstract text. On the other hand, a lot of the people you probably want to reach have been de-sensitized to fragmentation/congestion klaxons - your message might get out faster without them." |
2006-12-13
|
05 | Brian Carpenter | [Ballot Position Update] New position, No Objection, has been recorded by Brian Carpenter |
2006-12-12
|
05 | Cullen Jennings | [Ballot discuss] I would like to talk about whether the data in this draft supports the conclusions it is making. |
2006-12-12
|
05 | Cullen Jennings | [Ballot Position Update] New position, Discuss, has been recorded by Cullen Jennings |
2006-12-12
|
05 | Brian Carpenter | [Ballot comment] 5. Implications ... IPv6 is less vulnerable to this type of problem, since its fragment header contains a 32-bit identification field [RFC2460]. Mis-association will only be a problem at packet rates 65536 times higher than for IPv4. Should note that IPv6 fragmentation only occurs e2e and there is no DF bit; hence errors caused by non-respect of the DF bit cannot occur. |
2006-12-11
|
05 | Russ Housley | [Ballot Position Update] New position, No Objection, has been recorded by Russ Housley |
2006-12-07
|
05 | Lars Eggert | [Ballot Position Update] New position, Yes, has been recorded for Lars Eggert |
2006-12-07
|
05 | Lars Eggert | Ballot has been issued by Lars Eggert |
2006-12-07
|
05 | Lars Eggert | Created "Approve" ballot |
2006-12-06
|
05 | Lars Eggert | Placed on agenda for telechat - 2006-12-14 by Lars Eggert |
2006-12-06
|
05 | Lars Eggert | State Changes to IESG Evaluation from Waiting for AD Go-Ahead::AD Followup by Lars Eggert |
2006-12-06
|
05 | Lars Eggert | Tentatively add to next telechat, pending verification by the reviewers that -03 addresses the issues found during IETF LC. |
2006-12-05
|
05 | (System) | Sub state has been changed to AD Follow up from New Id Needed |
2006-12-05
|
03 | (System) | New version available: draft-heffner-frag-harmful-03.txt |
2006-11-28
|
05 | Lars Eggert | State Changes to Waiting for AD Go-Ahead::Revised ID Needed from AD Evaluation::Revised ID Needed by Lars Eggert |
2006-11-28
|
05 | Lars Eggert | This should be in "Waiting for AD Go-Ahead" after LC, not in "AD Evaluation". |
2006-11-20
|
05 | Lars Eggert | State Changes to AD Evaluation::Revised ID Needed from AD Evaluation by Lars Eggert |
2006-11-20
|
05 | Lars Eggert | Revised ID needed to address LC comments. |
2006-11-20
|
05 | Lars Eggert | State Changes to AD Evaluation from Waiting for Writeup by Lars Eggert |
2006-11-17
|
05 | (System) | State has been changed to Waiting for Writeup from In Last Call by system |
2006-11-08
|
05 | (System) | Request for Last Call review by SECDIR Completed. Reviewer: Catherine Meadows. |
2006-11-08
|
05 | (System) | Requested Last Call review by SECDIR |
2006-10-26
|
05 | Yoshiko Fong | IANA Last Call Comment: As described in the IANA Considerations section, we understand this document to have NO IANA Actions |
2006-10-10
|
05 | Amy Vezza | Last call sent |
2006-10-10
|
05 | Amy Vezza | State Changes to In Last Call from Last Call Requested by Amy Vezza |
2006-10-10
|
05 | Lars Eggert | Lengthened LC due to overlap with IETF-67. |
2006-10-10
|
05 | Lars Eggert | Last Call was requested by Lars Eggert |
2006-10-10
|
05 | (System) | Ballot writeup text was added |
2006-10-10
|
05 | (System) | Last call text was added |
2006-10-10
|
05 | (System) | Ballot approval text was added |
2006-10-10
|
05 | Lars Eggert | State Changes to Last Call Requested from Publication Requested by Lars Eggert |
2006-10-10
|
05 | Lars Eggert | State Changes to Publication Requested from AD is watching by Lars Eggert |
2006-06-22
|
02 | (System) | New version available: draft-heffner-frag-harmful-02.txt |
2006-04-24
|
01 | (System) | New version available: draft-heffner-frag-harmful-01.txt |
2006-04-06
|
05 | Lars Eggert | Intended Status has been changed to Informational from None |
2006-04-03
|
05 | Lars Eggert | State Change Notice email list has been changed to jheffner@psc.edu, mathis@psc.edu, bchandle@psc.edu from jheffner@psc.edu |
2006-04-03
|
05 | Lars Eggert | Area acronym has been changed to tsv from gen |
2006-04-03
|
05 | Lars Eggert | Draft Added by Lars Eggert in state AD is watching |
2006-03-31
|
00 | (System) | New version available: draft-heffner-frag-harmful-00.txt |
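
The figures quoted in the ballot text above (about 26 Mbit/s for 1500 byte packets with a 30 second maximum packet lifetime, under 1 second to send 65536 packets at 1 Gbit/s, the factor of 65536 for IPv6, and the 10^4 to 10^6 errored packets in Eric Rescorla's review) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical and are not taken from the draft, and it assumes one identification value per packet and decimal units (1 Gbit/s = 10^9 bit/s).

```python
def max_rate_bits_per_s(packet_bytes, lifetime_s, id_bits=16):
    """Highest send rate that stays within 2**id_bits - 1 packets per lifetime.

    Assumes every packet consumes one value of the identification field.
    """
    max_packets = 2 ** id_bits - 1
    return max_packets * packet_bytes * 8 / lifetime_s


def id_wrap_time_s(link_bits_per_s, packet_bytes, id_bits=16):
    """Time to send 2**id_bits packets when the link is kept full."""
    packets_per_s = link_bits_per_s / (packet_bytes * 8)
    return 2 ** id_bits / packets_per_s


def expected_errored_packets(total_bytes, packet_bytes, error_rate):
    """Packets expected to be in error at a given per-packet error rate."""
    return total_bytes / packet_bytes * error_rate


# 1500 byte packets, 30 second maximum packet lifetime: about 26 Mbit/s.
print(max_rate_bits_per_s(1500, 30) / 1e6)

# 1 Gbit/s filled with 1500 byte packets: the 16-bit field wraps in under 1 s.
print(id_wrap_time_s(1e9, 1500))

# A 32-bit identification field has 65536 times as many values as a 16-bit one.
print(2 ** 32 // 2 ** 16)

# 10 TB sent as 10^4 byte packets at error rates of 10^-5 and 10^-3.
print(expected_errored_packets(10 ** 13, 10 ** 4, 1e-5))
print(expected_errored_packets(10 ** 13, 10 ** 4, 1e-3))
```

The functions take the identification field width as a parameter so that the IPv4 and IPv6 cases can be compared with the same code.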