Network Working Group J. Heffner
Internet-Draft M. Mathis
Expires: October 23, 2006 B. Chandler
PSC
April 21, 2006
Fragmentation Considered Very Harmful
draft-heffner-frag-harmful-01
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on October 23, 2006.
Copyright Notice
Copyright (C) The Internet Society (2006).
Abstract
IPv4 fragmentation is not sufficiently robust for general use in
today's Internet. The 16-bit IP identification field is not large
enough to prevent frequent incorrectly assembled IP fragments, and
the TCP and UDP checksums are insufficient to prevent the resulting
corrupted datagrams from being delivered to higher protocol layers.
This note describes some easily reproduced experiments demonstrating
the problem, and discusses some of the operational implications of
Heffner, et al. Expires October 23, 2006 [Page 1]
Internet-Draft Fragmentation Considered Very Harmful April 2006
these observations.
1. Introduction
The IPv4 header was designed at a time when data rates were several
orders of magnitude lower than those achievable today. This document
describes a consequent scale-related failure in the IP identification
(ID) field, where fragments may be incorrectly assembled at a rate
high enough that it is likely to invalidate assumptions about data
integrity failure rates.
That IP fragmentation results in inefficient use of the network has
been well documented [Kent87]. This note presents a different kind
of problem, which can result not only in significant performance
degradation, but also frequent data corruption. This is especially
pertinent due to the recent proliferation of UDP bulk transport tools
that sometimes fragment every datagram. Additionally, there is some
network equipment that ignores the Don't Fragment (DF) bit in the IP
header to work around MTU discovery problems [RFC2923]. This
equipment indirectly exposes properly implemented protocols and
applications to corrupt data.
2. Wrapping the IP ID Field
The Internet Protocol standard specifies:
"The choice of the Identifier for a datagram is based on the need
to provide a way to uniquely identify the fragments of a
particular datagram. The protocol module assembling fragments
judges fragments to belong to the same datagram if they have the
same source, destination, protocol, and Identifier. Thus, the
sender must choose the Identifier to be unique for this source,
destination pair and protocol for the time the datagram (or any
fragment of it) could be alive in the Internet." [RFC0791]
Strict conformance to this standard limits transmissions in one
direction between any address pair to no more than 65536 packets per
protocol (e.g. TCP, UDP or ICMP) per maximum packet lifetime.
Clearly not all hosts will follow this standard, because it implies
an unreasonably low maximum data rate. For example, a host sending
1500 byte packets with a 30 second maximum packet lifetime could send
at only about 26 Mbit/s before exceeding 65536 packets per packet
lifetime. Or, filling a 1 Gbit/s interface with 1500 byte packets
requires sending 65536 packets in less than 1 second, an unreasonably
short maximum packet lifetime, being less than the round-trip time on
some paths. This requirement is widely ignored.
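The two figures above can be checked with a short calculation. The
following is a minimal sketch; the function names are illustrative
and not taken from any standard, and packet sizes are counted as IP
packet bytes only, ignoring link-layer overhead.

```python
ID_SPACE = 2 ** 16  # distinct values of the 16-bit IPv4 ID field


def max_rate_bits_per_s(packet_bytes, lifetime_s, id_space=ID_SPACE):
    """Highest rate that sends at most id_space packets per lifetime."""
    return id_space * packet_bytes * 8 / lifetime_s


def wrap_time_s(link_bits_per_s, packet_bytes, id_space=ID_SPACE):
    """Time needed to send id_space packets when the link is kept full."""
    return id_space * packet_bytes * 8 / link_bits_per_s


# 1500 byte packets, 30 second maximum packet lifetime
print(max_rate_bits_per_s(1500, 30) / 1e6)  # about 26.2 (Mbit/s)

# 1500 byte packets filling a 1 Gbit/s interface
print(wrap_time_s(1e9, 1500))               # about 0.79 (seconds)
```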
IP receivers store fragments in a reassembly buffer until all
fragments in a datagram arrive, or until the reassembly timeout
expires (15 seconds is suggested in [RFC0791]). Fragments in a
datagram are associated with each other by the value in their ID
field, and by the source address, destination address, and protocol.
If a sender
wraps the ID field in less than the reassembly timeout, it becomes
possible for fragments from different datagrams to be incorrectly
spliced together ("mis-associated"), and delivered to the upper layer
protocol.
A case of particular concern is when mis-association is self-
propagating. This occurs, for example, when there is reliable
ordering of packets and the first fragment of a datagram is lost in
the network. The rest of the fragments are stored in the fragment
reassembly buffer, and when the sender wraps the ID field, the first
fragment of the new datagram will be mis-associated with the rest of
the old datagram. The new datagram will now be incomplete (since
it is missing its first fragment), so the rest of it will be saved in
the fragment reassembly buffer, forming a cycle that repeats every
65536 datagrams. It is possible to have a number of simultaneous
cycles, bounded by the size of the fragment reassembly buffer.
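The self-propagating cycle can be illustrated with a toy model. The
sketch below is not a real reassembly implementation: it assumes two
fragments per datagram, in-order delivery, a single flow (so the ID
alone keys the buffer), and a reassembly timeout that never expires
before the ID wraps. The function name and parameters are
hypothetical.

```python
def simulate(num_datagrams, id_space=2 ** 16, lost_first=frozenset({0})):
    """Return (first_seq, rest_seq) pairs for mis-associated reassemblies.

    Datagram number seq carries IP ID seq % id_space and is sent as a
    "first" fragment followed by a "rest" fragment.  The first fragment
    of every datagram listed in lost_first is dropped by the network.
    """
    buffer = {}            # IP ID -> fragments currently held
    mis_associated = []
    for seq in range(num_datagrams):
        ip_id = seq % id_space
        for part in ("first", "rest"):
            if part == "first" and seq in lost_first:
                continue   # lost in the network
            slot = buffer.setdefault(ip_id, {})
            slot[part] = seq
            if len(slot) == 2:          # both halves present: reassemble
                if slot["first"] != slot["rest"]:
                    mis_associated.append((slot["first"], slot["rest"]))
                del buffer[ip_id]
    return mis_associated


# A small ID space keeps the example short: one lost first fragment
# causes one mis-association on every wrap.
print(simulate(25, id_space=8))  # [(8, 0), (16, 8), (24, 16)]
```

With no loss (lost_first=frozenset()) the same model reports no
mis-association, which reflects the text: a single lost first
fragment is enough to start a cycle.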
3. Harmful Effects of Mis-Associated Fragments
When the mis-associated fragments are delivered, transport-layer
checksumming should detect these datagrams as incorrect and discard
them. When the datagrams are discarded, it could pose a problem for
loss-feedback congestion control algorithms since there will be a
high number of non-congestion-related losses.
However, transport checksums may not be designed to handle such high
error rates, either. The TCP/UDP checksum is only 16 bits in length.
If these checksums follow a uniform random distribution, we expect
mis-associated datagrams to be accepted by the checksum at a rate of
one per 65536. With only one mis-association cycle, we expect
corrupt data delivered to the application layer once per 2^32
datagrams. This rate can be significantly higher with multiple
cycles.
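The estimate above can be written out as a small calculation. This
is a sketch under the same assumptions as the text (uniformly random
checksums, one mis-association per cycle per wrap of the ID field);
the datagram rate used in the example is hypothetical.

```python
CHECKSUM_SPACE = 2 ** 16   # possible TCP/UDP checksum values
ID_SPACE = 2 ** 16         # datagrams sent per wrap of the IP ID field


def datagrams_per_corruption(cycles=1):
    """Expected datagrams sent per corrupt datagram that passes the checksum."""
    return ID_SPACE * CHECKSUM_SPACE / cycles


def seconds_per_corruption(datagrams_per_s, cycles=1):
    """Expected time between undetected corruptions at a given send rate."""
    return datagrams_per_corruption(cycles) / datagrams_per_s


print(datagrams_per_corruption() == 2 ** 32)   # True
# Hypothetical sender at 10,000 datagrams per second, one cycle:
print(seconds_per_corruption(10_000) / 86400)  # about 4.97 (days)
```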
With non-random data, the TCP/UDP checksum may be weaker still.
It is possible to construct datasets where mis-associated fragments
will always have the same checksum. Such a case may be considered
unlikely, but is worth considering. "Real" data may be more likely
than random data to cause checksum hot spots and increase the
probability of false checksum match [Stone98]. Also, some
applications may turn off checksumming to increase speed, though this
practice has been found to be dangerous for other reasons [Stone00].
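One way such a dataset can arise follows from the fact that the
Internet checksum is a ones' complement sum of 16-bit words, so it
does not depend on word order. The sketch below is illustrative
only: it computes the checksum over the payload alone (a real UDP
checksum also covers the pseudo-header and UDP header, which are
identical for equal-length datagrams in one flow), and the fragment
contents are made up.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                 # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# "rest" fragment left over from an old datagram, and the two
# fragments of a new datagram that reuses the same IP ID.
old_rest = b"\x11\x22\x33\x44"
new_first = b"\x00\x03\x00\x04"
new_rest = b"\x33\x44\x11\x22"          # same 16-bit words, reordered

original = new_first + new_rest
spliced = new_first + old_rest          # mis-associated reassembly

print(spliced != original)                                        # True
print(internet_checksum(spliced) == internet_checksum(original))  # True
```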
4. Experimental Observations
To test the practical impact of fragmentation on UDP, we ran a series
of experiments using a UDP bulk data transport protocol that was
designed to be used as an alternative to TCP for transporting large
data sets over specialized networks. The tool, Reliable Blast UDP
(RBUDP), part of the QUANTA networking toolkit [QUANTA], was selected
because it has a clean interface which facilitated automated
experiments. The decision to use RBUDP had little to do with the
details of the transport protocol itself. Any UDP transport protocol
that does not have additional means to detect corruption, and that
could be configured to use IP fragmentation, would have the same
results.
In order to diagnose corruption on files transferred with the UDP
bulk transfer tool, we used a file format that included embedded
sequence numbers and MD5 checksums in each fragment of each datagram.
Thus it was possible to distinguish random corruption from that
caused by mis-associated fragments. We used two different types of
files. One was constructed so that all the UDP checksums were
constant -- we will call this the "constant" dataset. The other was
constructed so that UDP checksums were uniformly random -- the
"random" dataset. All tests were done using 400 MB files.
The UDP bulk file transport tool was used to send the datasets
between a pair of hosts at slightly less than the available data rate
(100 Mbps). Near the beginning of each flow, a brief secondary flow
was started to induce packet loss in the primary flow. Throughout
the life of the primary flow, we typically observed mis-association
rates on the order of a few hundredths of a percent.
Tests run with the "constant" dataset resulted in corruption on all
mis-associated fragments, that is, corruption on the order of a few
hundredths of a percent. In sending approximately 10 TB of "random"
datasets, we observed 8847668 UDP checksum errors and 121 corruptions
of the data due to mis-associated fragments.
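As a rough consistency check, the two counts can be compared with
the one-in-65536 acceptance rate expected for uniformly random
checksums. The sketch below assumes that every reported UDP checksum
error was caused by a mis-associated datagram, which is a simplifying
assumption rather than something stated above.

```python
checksum_errors = 8847668   # UDP checksum errors reported above
corruptions = 121           # corrupt datagrams accepted by the checksum

mis_associated = checksum_errors + corruptions
observed_ratio = checksum_errors / corruptions
expected_corruptions = mis_associated / 2 ** 16

print(round(observed_ratio))           # 73121 rejected per accepted
print(round(expected_corruptions, 1))  # 135.0 expected at 1 in 65536
```

Under that assumption, the observed count of 121 is of the same order
as the roughly 135 predicted for a uniform 16-bit checksum.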
5. Implications
Most TCP implementations today participate in MTU discovery
[RFC1191], which will avoid the problems described in this note by
avoiding IP fragmentation altogether. However, as a work-around for
MTU discovery problems [RFC2923], some TCP implementations and
communications gear provide mechanisms to disable path MTU discovery
by clearing or ignoring the DF bit. Doing so will expose all
protocols using IPv4, even those which participate in MTU discovery,
to mis-association errors.
IPv6 is less vulnerable to this type of problem, since its fragment
header contains a 32-bit identification field [RFC2460]. Mis-
association will only be a problem at packet rates 65536 times higher
than for IPv4.
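The factor of 65536 follows directly from the width of the
identification field. The sketch below computes the packet rate at
which the field wraps once per reassembly timeout; applying the 15
second value suggested in [RFC0791] to both protocols is an
assumption made only for the comparison, and the function name is
illustrative.

```python
def wrap_packet_rate(id_bits, reassembly_timeout_s):
    """Packets per second at which the ID field wraps once per timeout."""
    return 2 ** id_bits / reassembly_timeout_s


TIMEOUT_S = 15.0  # assumed for both protocols, for comparison only

ipv4_rate = wrap_packet_rate(16, TIMEOUT_S)
ipv6_rate = wrap_packet_rate(32, TIMEOUT_S)

print(round(ipv4_rate))       # 4369 packets per second
print(round(ipv6_rate))       # 286331153 packets per second
print(ipv6_rate / ipv4_rate)  # 65536.0
```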
Since mis-association of fragments will only occur when the IP ID
field is wrapped within the fragment reassembly timeout, it may be
possible to reduce the timeout sufficiently so that mis-association
will not occur. However, there are a number of difficulties with
such an approach. Since the sender controls the rate of packets sent
and selection of IP ID, while the receiver controls the reassembly
timeout, there would need to be some mutual assurance between each
party as to participation in the scheme. Further, it is not
generally possible to set the timeout low enough so that a fast
sender's fragments will not be mis-associated, yet high enough so
that a slow sender's fragments will not be unconditionally discarded
before it is possible to reassemble them. So the timeout and IP ID
selection would need to be done on a per-peer basis. Also, NAT is
likely to break any per-peer tables keyed by IP address. It is
not within the scope of this document to recommend solutions to these
problems.
Another means of solving the corruption issue is to add stronger
integrity checking, which can be done at any layer above IP. This is
a natural side effect of using cryptographic authentication. If
IPsec AH [RFC2402] is in use, the mis-associated fragments will be
discarded at the network layer with extremely high probability. Some
higher layers may use longer checksums (for example, SCTP's is 32
bits in length [RFC2960]) or cryptographic authentication (SSH
message authentication codes [RFC4251]). While stronger integrity
checking may prevent data corruption, it will not solve the problem
of a high effective loss rate. In the case of SSH, any stream
corruption results in immediate termination of the connection.
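As an illustration of stronger checking above IP, an application
could append its own 32-bit check value to each datagram payload.
The sketch below uses the CRC-32 from Python's zlib module; this is
not the checksum used by SCTP and is not a cryptographic
authentication code, and the framing and function names are
hypothetical. As noted above, such a check only turns corruption
into loss.

```python
import struct
import zlib


def seal(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 trailer to the payload."""
    return payload + struct.pack("!I", zlib.crc32(payload))


def open_frame(frame: bytes):
    """Return the payload if the trailer matches, otherwise None."""
    if len(frame) < 4:
        return None
    payload, trailer = frame[:-4], frame[-4:]
    (expected,) = struct.unpack("!I", trailer)
    return payload if zlib.crc32(payload) == expected else None


old = seal(b"\x00\x00\x00\x01DATA")
new = seal(b"\x00\x00\x00\x02DATA")

# First 4 bytes of the new datagram spliced onto the rest of the old one.
spliced = new[:4] + old[4:]

print(open_frame(new))      # b'\x00\x00\x00\x02DATA'
print(open_frame(spliced))  # None
```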
6. Security Considerations
If a malicious entity knows that a pair of hosts are communicating
using a fragmented stream, that knowledge may give the entity an
opportunity to corrupt the flow. By sending "high" fragments (those with
offset greater than zero) with a forged source address, the attacker
can deliberately cause corruption as described above. Exploiting
this vulnerability requires only knowledge of the source and
destination addresses of the flow, and fragment boundaries. It does
not require knowledge of port or sequence numbers.
If the attacker has visibility of packets on the path, the attack
profile is similar to injecting full segments. Using this attack
makes blind disruptions easier, and could certainly be used
effectively to cause denial of service. However, only streams using
IPv4 fragmentation are vulnerable. Because of the nature of the
problems outlined in this draft, the use of IPv4 fragmentation for
critical applications may not be advisable regardless of security
concerns.
7. References
[Kent87] Kent, C. and J. Mogul, "Fragmentation considered harmful",
Proc. SIGCOMM '87 vol. 17, No. 5, October 1987.
[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery",
RFC 2923, September 2000.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
September 1981.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
November 1990.
[Stone98] Stone, J., Greenwald, M., Partridge, C., and J. Hughes,
"Performance of Checksums and CRC's over Real Data", IEEE/
ACM Transactions on Networking vol. 6, No. 5,
October 1998.
[Stone00] Stone, J. and C. Partridge, "When The CRC and TCP Checksum
Disagree", Proc. SIGCOMM 2000 vol. 30, No. 4,
October 2000.
[QUANTA] He, E., Alimohideen, J., Eliason, J., Krishnaprasad, N.,
Leigh, J., Yu, O., and T. DeFanti, "Quanta: a toolkit for
high performance data delivery over photonic networks",
Future Generation Computer Systems Vol. 19, No. 6,
August 2003.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, December 1998.
[RFC2960] Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M.,
Zhang, L., and V. Paxson, "Stream Control Transmission
Protocol", RFC 2960, October 2000.
[RFC2402] Kent, S. and R. Atkinson, "IP Authentication Header",
RFC 2402, November 1998.
[RFC4251] Ylonen, T. and C. Lonvick, "The Secure Shell (SSH)
Protocol Architecture", RFC 4251, January 2006.
Appendix A. Acknowledgements
This work was supported by the National Science Foundation under
Grant No. 0083285.
Authors' Addresses
John W. Heffner
Pittsburgh Supercomputing Center
4400 Fifth Avenue
Pittsburgh, PA 15213
US
Phone: 412-268-2329
Email: jheffner@psc.edu
Matt Mathis
Pittsburgh Supercomputing Center
4400 Fifth Avenue
Pittsburgh, PA 15213
US
Phone: 412-268-3319
Email: mathis@psc.edu
Ben Chandler
Pittsburgh Supercomputing Center
4400 Fifth Avenue
Pittsburgh, PA 15213
US
Phone: 412-268-9783
Email: bchandle@psc.edu
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2006). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.