Network Working Group F. Templin, Ed.
Internet-Draft Boeing Phantom Works
Intended status: Experimental May 11, 2007
Expires: November 12, 2007
Link Adaptation for IPv6-in-(foo)*-in-IPv4 Tunnels
draft-templin-linkadapt-06.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on November 12, 2007.
Copyright Notice
Copyright (C) The IETF Trust (2007).
Abstract
IPv6-in-(foo)*-in-IPv4 tunnels must support a minimum Maximum
Transmission Unit (MTU) of 1280 bytes for IPv6 via static
prearrangements and/or dynamic MTU determination based on ICMPv4
messages, but these methods have known operational limitations. This
document specifies a link adaptation mechanism for IPv6-in-(foo)*-in-
IPv4 tunnels that presents an assured MTU to the IPv6 layer using
tunnel endpoint-based segmentation/reassembly and dynamic segment
size probing.
Templin Expires November 12, 2007 [Page 1]
Internet-Draft Link Adaptation for Tunnels May 2007
Table of Contents
1. Introduction
2. Terminology
3. Tunnel MTU Assurance Methods and Issues
4. Link Adaptation for IPv6-in-(foo)*-in-IPv4 Tunnels
  4.1. Layering
  4.2. Initial Negotiation Phase
  4.3. Tunnel MTU and MRU
  4.4. Ingress Tunnel Endpoint Specification
    4.4.1. Segmentation and Encapsulation
    4.4.2. IPv4 Fragmentation and Setting the DF Bit
    4.4.3. Probing
    4.4.4. Processing Errors
  4.5. Egress Tunnel Endpoint Specification
    4.5.1. Decapsulation and Reassembly
    4.5.2. Sending Errors
    4.5.3. Sending Probe Replies
    4.5.4. Active Reassembly Buffer Management
5. IANA Considerations
6. Security Considerations
7. Acknowledgments
8. Appendix A: Additional Considerations
9. Appendix B: Changes
10. References
  10.1. Normative References
  10.2. Informative References
Author's Address
Intellectual Property and Copyright Statements
1. Introduction
IPv6-in-(foo)*-in-IPv4 tunnels may span multiple IPv4 network hops
yet are seen by IPv6 as ordinary links that must support the minimum
IPv6 Maximum Transmission Unit (MTU) of 1280 bytes ([RFC2460],
Section 5). Common tunneling mechanisms (e.g.,
[RFC3056][RFC4213][RFC4214][RFC4380], etc.) meet this requirement
through conservative static prearrangements at the expense of
degraded performance over some paths due to excessive IPv4 network-
based fragmentation and/or missed opportunities to discover larger
MTUs. Optional dynamic MTU determination methods [RFC1191] are also
available, but may not provide adequate robustness.
This document specifies a link adaptation mechanism for IPv6-in-
(foo)*-in-IPv4 tunnels that presents an assured MTU to the IPv6
layer. It uses tunnel endpoint-based segmentation/reassembly and
dynamic segment size probing with authenticated probe feedback.
Thus, it provides greater robustness and efficiency by avoiding IPv4
network-based fragmentation and dependence on ICMPv4 feedback from
IPv4 network middleboxes.
2. Terminology
The following terms are defined within the scope of this document:
Upper Layer Payload (ULP)
a whole IPv6 packet, or a fragment packet created by IPv6
fragmentation.
Ingress Tunnel Endpoint (ITE)
the tunnel interface endpoint that accepts ULPs from the IP layer
and segments/packetizes them for transmission into a tunnel.
Egress Tunnel Endpoint (ETE)
the tunnel interface endpoint that receives packets from a tunnel
and de-packetizes/reassembles them into ULPs for delivery to the
IP layer.
IP Layer
the layer above the tunnel interface, i.e., the IPv6 layer.
Sub-IP Layer
any sublayers that occur within the tunnel interface, i.e., any
(foo)* layers and including the upper portion of the IPv4 layer.
Note that IPv4 is also viewed as the Layer 2 protocol from the
perspective of the tunnel, so the Sub-IP layer begins below the IP
layer and extends into Layer 2.
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
document, are to be interpreted as described in [RFC2119].
3. Tunnel MTU Assurance Methods and Issues
Common tunnel MTU assurance methods include classical IPv4
fragmentation [RFC0791], and IPv4/IPv6 Path MTU discovery
[RFC1191][RFC1981]. Other possibilities include operational
assurance of widely-deployed links with large MTUs. However, these
methods have operational limitations that are well documented
[FRAG][I-D.heffner-frag-harmful][RFC2923][RFC4459].
This document specifies a link adaptation scheme for IPv6-in-(foo)*-
in-IPv4 tunnels that is distinct from the above alternatives and
avoids the issues. It entails segmentation at the ITE and reassembly
at the ETE at a logical mid-layer between IPv6 fragmentation and IPv4
fragmentation. It therefore resembles classical IPv4 fragmentation
but: 1) only allows fragmentation to occur at the ITE, 2) supports
path probing to detect the optimum segment size, and 3) avoids
sequence number wrapping and data integrity issues through careful
reassembly buffer management at the ETE. The scheme is specified in
the following sections.
4. Link Adaptation for IPv6-in-(foo)*-in-IPv4 Tunnels
The following subsections specify link adaptation mechanisms for
IPv6-in-(foo)*-in-IPv4 tunnels with properties similar to the link
adaptation mechanisms defined for AAL5 [RFC2684] and IEEE 802.11
[WLAN]:
4.1. Layering
IPv6-in-(foo)*-in-IPv4 tunnel endpoints operate at a logical midpoint
between the IPv6 and IPv4 protocol modules. From the viewpoint of
IPv6, the tunnel appears as an ordinary network interface module that
delivers whole IPv6 packets and IPv6 fragment packets as ULPs to and
from an underlying link. From the viewpoint of IPv4, the tunnel
appears as a packetization layer protocol that segments and
reassembles ULPs.
This document refers to the IPv6 layer as the "IP Layer" (i.e., layer
3) and any sublayers that occur within the tunnel interface (i.e.,
any (foo)* layers and including the upper portion of the IPv4 layer
itself) as the "Sub-IP layer". Note that IPv4 is also viewed as the
Layer 2 protocol from the perspective of the tunnel, so the Sub-IP
layer begins below the IP layer and extends into Layer 2. Note also
that (foo)* may entail multiple nested sublayers or may even be NULL,
i.e., in the case of IPv6-in-IPv4 tunnels.
4.2. Initial Negotiation Phase
IPv6-in-(foo)*-in-IPv4 tunnel endpoints MUST first determine that the
link adaptation mechanisms are implemented by both the ITE and ETE
through an initial negotiation phase specified outside the scope of
this document. ITEs/ETEs for which one or both ends of the tunnel do
not implement the scheme MUST use the default MTU assurance
mechanisms specified for the particular IPv6-in-(foo)*-in-IPv4
tunneling mechanism, and do not implement any other aspects of this
specification.
4.3. Tunnel MTU and MRU
ITEs MUST configure a minimum IPv6 link MTU of 1280 bytes for all
flows and SHOULD provide a configuration knob to set larger values.
A nominal per-flow MTU of 9180 bytes (i.e., the same as defined in
[RFC1626]) is RECOMMENDED, since it is large enough to accommodate
frame sizes as large as Gigabit Ethernet Jumbo Frames [GIGE]. ITEs
MAY set still larger MTU values, but are advised that this may lead
to excessive packet loss and ICMPv6 "packet too big" messages.
ETEs MUST configure a minimum per-flow Sub-IP layer reassembly buffer
size (i.e., a minimum Sub-IP layer Maximum Receive Unit (MRU)) of
1280 bytes, and SHOULD configure an MRU of 9180 bytes or larger to
accommodate the recommended nominal MTU for ITEs. A maximum MRU of
11454 bytes is RECOMMENDED, since 11454 bytes is the maximum packet
size for which a 32-bit CRC can provide Ethernet-quality bit error
detection [JAIN][AARNET]. ETEs MAY set still larger MRU values, but
are advised that larger values may lead to unacceptable levels of
undetected errors unless all physical segments in the path provide
assured error-free delivery for larger packets.
4.4. Ingress Tunnel Endpoint Specification
The following subsections specify mechanisms implemented by the ITE:
4.4.1. Segmentation and Encapsulation
ITEs maintain a per-flow MTU and per-flow segment size ("SEGSIZE")
for the purpose of segmenting ULPs that are too large to traverse the
tunnel. It is RECOMMENDED that ITEs configure an initial per-flow
SEGSIZE such that (SEGSIZE + length((foo)* headers) + length(IPv4
header)) yields an IPv4 datagram size between 256 and 576 bytes (since
256 bytes can safely accommodate the recommended nominal MTU (see:
Section 4.3), and since IPv4 nodes are only required to accept
datagrams of up to 576 bytes [RFC0791]). Since most IPv4 links in the Internet
configure still larger MTUs [RFC3150][RFC3819], and since IPv4 nodes
should accept packets as large as the underlying link MTU [RFC1122],
ITEs MAY use a still larger initial per-flow SEGSIZE if there is
assurance that it would not cause gratuitous IPv4 fragmentation
and/or overrun the IPv4 reassembly buffer. ITEs probe the path to
maintain SEGSIZE and/or discover larger SEGSIZEs during the lifetime
of a flow (see: Section 4.4.3).
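As a rough illustration of the sizing arithmetic above, the following
Python sketch derives a SEGSIZE from a target IPv4 datagram size and
counts the segments a ULP of the nominal MTU would need. The function
names, the 20-byte IPv4 header without options, the absence of (foo)*
headers, and the 63-segment chain limit (inferred from the SEGID range
of 0 to 62 used for data segments in this section) are assumptions made
for illustration and are not part of the specification.

```python
import math

MAX_CHAIN_LEN = 63  # assumption: SEGID values 0..62 carry data segments


def segsize_for(datagram_size, foo_hdr_len=0, ipv4_hdr_len=20):
    """SEGSIZE such that SEGSIZE + (foo)* headers + IPv4 header
    equals the target IPv4 datagram size."""
    segsize = datagram_size - foo_hdr_len - ipv4_hdr_len
    if segsize <= 0:
        raise ValueError("headers leave no room for a segment")
    return segsize


def chain_length(ulp_len, segsize):
    """Segments needed for a ULP. A 4-byte trailing checksum is counted
    only when the ULP does not fit in a single segment."""
    if ulp_len <= segsize:
        return 1
    return math.ceil((ulp_len + 4) / segsize)


segsize = segsize_for(256)              # smallest recommended datagram size
needed = chain_length(9180, segsize)    # nominal MTU from Section 4.3
fits = needed <= MAX_CHAIN_LEN
```

Under these assumptions a 256-byte datagram carries a 236-byte segment,
and a 9180-byte ULP fits within the chain length limit.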
ITEs split each ULP they send into a tunnel into chains of segments
for packetization and presentation to the IPv4 layer. For ULPs that
will span multiple segments, the ITE first uses the 2's complement
Fletcher-32 checksum [STONE][RFC3385] to calculate a checksum across
the entire ULP, then appends the A and B results as a trailing 32-bit
checksum at the end of the ULP. For ULPs that fit within a single
segment, the ITE omits the trailing checksum.
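The checksum step can be sketched in Python as follows. This is a
minimal illustration only: the function names are hypothetical, and the
big-endian 16-bit word order, the zero padding of odd-length input, and
the placement of A before B in the trailer are assumptions that this
document does not specify. The "2's complement" variant is taken here
to mean that both running sums are reduced modulo 2**16.

```python
def fletcher32_twos(data):
    """Return the (A, B) running sums over 16-bit words, modulo 2**16."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length input with a zero byte
    a = b = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]  # assumption: big-endian words
        a = (a + word) & 0xFFFF
        b = (b + a) & 0xFFFF
    return a, b


def append_checksum(ulp):
    """Append A then B (assumed order) as a trailing 32-bit checksum."""
    a, b = fletcher32_twos(ulp)
    return ulp + a.to_bytes(2, "big") + b.to_bytes(2, "big")
```

An ITE following this sketch would call append_checksum() only for ULPs
that span more than one segment.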
The ITE next splits the ULP into a chain of consecutive segments that
MUST be created as contiguous and non-overlapping, i.e., the final
byte of the (i)th segment MUST be the byte that immediately precedes
the first byte of the (i+1)th segment. Non-final segments in the
chain MUST be identical in length and no larger than SEGSIZE bytes;
the final segment MAY be of different length. The ITE encapsulates
each segment in Sub-IP layer headers (including any (foo)* headers
and an IPv4 header) to form a chain of IPv4 packets; each packet in
the chain MUST include Sub-IP layer encapsulation headers of
identical length. The ITE sets the DF bit in the IPv4 header
according to the specification in Section 4.4.2, and encodes the
following information in the 16-bit IPv4 "Identification" field of
each segment:
0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ULPID | SEGID |P|A|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
IPv4 Identification Field
ULPID: 8 bits
An identifying value assigned by the ITE to aid the ETE in
reassembling the segments of a ULP.
SEGID: 6 bits
A value that identifies a specific segment within a ULP.
P: 1 bit
Probe flag; 0 = Ordinary Segment, 1 = Probe Segment.
A: 1 bit
Additional Segments flag; 0 = Last Segment, 1 = Additional
Segments.
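The field layout above can be illustrated with a short Python sketch
that packs and unpacks the 16-bit Identification value. The function
names are hypothetical, and the sketch assumes the usual RFC diagram
convention in which bit 0 is the most significant bit.

```python
def pack_ident(ulpid, segid, probe, more):
    """Build the 16-bit value: ULPID (8) | SEGID (6) | P (1) | A (1)."""
    if not 0 <= ulpid <= 0xFF:
        raise ValueError("ULPID must fit in 8 bits")
    if not 0 <= segid <= 0x3F:
        raise ValueError("SEGID must fit in 6 bits")
    return (ulpid << 8) | (segid << 2) | (int(bool(probe)) << 1) | int(bool(more))


def unpack_ident(ident):
    """Return (ULPID, SEGID, P, A) from a 16-bit Identification value."""
    return (
        (ident >> 8) & 0xFF,
        (ident >> 2) & 0x3F,
        bool(ident & 0x2),
        bool(ident & 0x1),
    )
```

For example, under this convention ULPID 0xAB, SEGID 5, P = 0 and A = 1
pack to the value 0xAB15.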
The ITE encodes an identical value in the "ULPID" field (bits 0 - 7
of the IPv4 Identification field) of each IPv4 packet in a chain to
identify the segments of a specific ULP; it encodes different ULPID
values in IPv4 packets that encapsulate segments of different ULPs.
The ITE also encodes an increasing Segment ID value between 0 and 62 in
the "SEGID" field (bits 8 - 13 of the IPv4 Identification field) of
consecutive packets in a chain, i.e., it encodes the value '0' in the
first packet, encodes the value '1' in the second packet, etc.
The ITE then sets the "Additional Segments - A" bit (bit 15 of the
IPv4 Identification field) in each packet in the chain except the
final one to indicate that additional segments follow. Finally, it
delivers each packet in the chain to the link layer (i.e., the IPv4
layer) in increasing SEGID order, i.e., SEGID 0 first, followed by
SEGID 1, etc., up to the final packet. The IPv4 layer SHOULD NOT
reorder the packets in a chain, but rather SHOULD deliver them to the
underlying link in the order in which the tunnel interface produced
them.
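The segmentation and numbering steps described above might be sketched
in Python as follows. The function name and return shape are
hypothetical; the sketch assumes the caller has already appended the
trailing checksum where one is required, that bit 0 of the
Identification field is the most significant bit, and that ordinary
(non-probe) segments are being produced. Sub-IP layer encapsulation is
omitted.

```python
def segment_ulp(payload, segsize, ulpid):
    """Split payload into an ordered chain of (identification, segment)
    pairs with contiguous, non-overlapping, equal-length non-final
    segments."""
    if segsize <= 0 or not payload:
        raise ValueError("need a positive SEGSIZE and a non-empty payload")
    chunks = [payload[i:i + segsize] for i in range(0, len(payload), segsize)]
    if len(chunks) > 63:  # SEGID values 0..62
        raise ValueError("chain too long; return 'packet too big' instead")
    chain = []
    for segid, chunk in enumerate(chunks):
        more = segid < len(chunks) - 1       # A bit: set on all but the last
        ident = ((ulpid & 0xFF) << 8) | (segid << 2) | int(more)
        chain.append((ident, chunk))
    return chain                              # already in increasing SEGID order
```

The list order is the order in which the packets would be handed to the
IPv4 layer.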
Note that IPv4 fragmentation in the network could theoretically
result in silent packet loss along certain paths even for packets
with the smallest recommended initial SEGSIZE (see: Section 4.4.2).
As such, a robust ITE implementation could reduce its IPv4 packet
sizes to as small as 68 bytes if it suspects that larger packets are
disappearing into a fragmentation-related black hole, but such small
packets might not satisfy the nominal tunnel MTU of 9180 bytes. ITEs
SHOULD therefore return locally-generated IPv6 "packet too big"
messages for IPv6 packets that cannot be segmented and encapsulated
within current IPv4 packet size and chain length limitations for the
tunnel.
4.4.2. IPv4 Fragmentation and Setting the DF Bit
When an ITE segments a ULP (see: Section 4.4.1), it can optionally
set or clear the "Don't Fragment - DF" bit in the encapsulating IPv4
headers of packets in the chain. If the DF bit is cleared,
gratuitous network-based IPv4 fragmentation could result in well-
known operational issues [FRAG] [I-D.heffner-frag-harmful]. Also,
some middleboxes (such as IPv4 NATs and firewalls) may only be
capable of passing the first fragment of a multi-fragment IPv4
datagram, and large multi-fragment datagrams could result in IPv4
reassembly buffer overruns. Finally, the minimum IPv4 MTU is only 68
bytes (i.e., the size required to encapsulate a maximum-length (60
byte) IPv4 header and a minimum-length (8 byte) fragment [RFC0791])
such that a limited amount of IPv4 fragmentation may occur in the
network even for relatively small packets.
Nonetheless, clearing the DF bit can in some circumstances increase
the packet delivery ratio when setting the DF bit would otherwise
result in excessive packet loss due to temporal link MTU
restrictions. In view of the above considerations, the ITE:
o SHOULD set the DF bit in probe packets (see: Section 4.4.3) larger
than 576 bytes.
o SHOULD set the DF bit in all packets larger than 576 bytes if it
will not perform active probing (see: Section 4.4.3).
o MAY clear the DF bit in any packets larger than 576 bytes if it
will perform active probing.
o MAY clear the DF bit in any packets of 576 bytes or smaller.
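One possible reading of the rules above is sketched below in Python.
The function name and the prefer_df parameter are hypothetical; the
latter stands in for whatever local policy an implementation applies in
the cases where this section says the DF bit MAY be cleared.

```python
def set_df(packet_len, is_probe, active_probing, prefer_df=True):
    """Return True if the DF bit is to be set in the IPv4 header."""
    if packet_len > 576:
        if is_probe:
            return True   # SHOULD set DF in probes larger than 576 bytes
        if not active_probing:
            return True   # SHOULD set DF when no active probing is done
    # Remaining cases: DF MAY be cleared; defer to local policy.
    return prefer_df
```

Because the first two rules are stated as SHOULD rather than MUST, an
implementation with good reason could deviate from this sketch.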
4.4.3. Probing
To increase efficiency and avoid excessive packet chain lengths, ITEs
SHOULD probe the path periodically to increase a flow's SEGSIZE to
larger values. ITEs probe a candidate SEGSIZE value 'N' by setting
the "Probe Segment - P" bit (bit 14 of the IPv4 Identification field)
in packets that encapsulate a probe segment of size N. For probe
segments that contain valid data for reassembly as part of a packet
chain, the ITE sets the appropriate SEGID value in the IPv4 packet
header as for ordinary segmentation. For probe segments that are to
be discarded by the ETE, the ITE sets the value 63 in the SEGID
field.
When the ITE sends a probe packet, it marks the probe as "pending"
for a period of 'MaxProbeDelay' msec (i.e., a per-flow round-trip
time estimate for the tunnel) and caches the probe packet's IPv4
destination, length and identification field values, as well as the
IPv6 flow label value [RFC3697]. If the ITE receives a valid Node
Information Query reply (NI Reply) [RFC4620] from the ETE (see:
Section 4.5.3) before the probe period expires, it marks the probe as
successful; otherwise, it marks the probe as failed. A valid NI
Reply MUST have:
o the Type, Code, Qtype and Flags fields set as specified for a NOOP
reply in ([RFC4620], Section 6.1), and
o bits 0-15 of the Nonce field matching the IPv4 length of the probe
packet, and
o bits 16-31 of the Nonce field matching the IPv4 identification of
the probe packet, and
o bits 32-51 of the Nonce field matching the IPv6 flow label value.
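The Nonce comparison in the list above can be sketched in Python as
follows. The function names are hypothetical. The sketch assumes the
64-bit Nonce field of [RFC4620] with bit 0 as the most significant bit,
and it assumes that bits 52-63, which this document leaves unspecified,
are sent as zero and ignored on receipt. Checking the Type, Code, Qtype
and Flags fields is omitted.

```python
def probe_nonce(ipv4_len, ipv4_ident, flow_label):
    """Pack probe parameters into an 8-byte Nonce:
    bits 0-15 length, bits 16-31 identification, bits 32-51 flow label."""
    if not (0 <= ipv4_len <= 0xFFFF and 0 <= ipv4_ident <= 0xFFFF):
        raise ValueError("length and identification must fit in 16 bits")
    if not 0 <= flow_label <= 0xFFFFF:
        raise ValueError("flow label must fit in 20 bits")
    value = (ipv4_len << 48) | (ipv4_ident << 32) | (flow_label << 12)
    return value.to_bytes(8, "big")


def nonce_matches(nonce, ipv4_len, ipv4_ident, flow_label):
    """Compare a received Nonce against the cached values of a pending probe."""
    if len(nonce) != 8:
        return False
    value = int.from_bytes(nonce, "big")
    return (
        (value >> 48) & 0xFFFF == ipv4_len
        and (value >> 32) & 0xFFFF == ipv4_ident
        and (value >> 12) & 0xFFFFF == flow_label
    )
```

An ITE following this sketch would mark the pending probe successful
only if nonce_matches() returns True before MaxProbeDelay expires.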
Following a successful probe, but before advancing SEGSIZE to N, the
ITE SHOULD enter a brief verification phase during which it sends
additional probe segments to detect asymmetric multipath MTU
restrictions and/or route fluctuations. Thereafter, the ITE SHOULD
re-probe periodically to confirm that packets with up to SEGSIZE byte
segments are still reaching the ETE.
After probing the path to discover a new SEGSIZE, the ITE may elect
to set or clear the DF bit in subsequent non-probe packets (see:
Section 4.4.2). For example, the ITE may elect to clear the DF bit
to maintain an optimal packet delivery ratio across temporal link MTU
restrictions (e.g., due to dynamic rerouting of flows, etc.) while it
may elect to set the DF bit to avoid all IPv4 fragmentation in the
network.
ITEs that elect to clear the DF bit in non-probe packets SHOULD
engage in "active probing" to periodically confirm SEGSIZE
"frequently enough" such that cyclical misassociations and possible
data corruptions at the ETE do not occur [I-D.heffner-frag-harmful]
if a flow begins to fragment. ITEs that elect to set the DF bit in
non-probe packets SHOULD carefully consider any ICMPv4 "fragmentation
needed" messages that arrive (see: Section 4.4.4) but are advised
that packet delivery ratios may suffer when the flow transmission
rate is high and/or the path round trip time is large.
4.4.4. Processing Errors
ITEs may receive ICMPv4 "fragmentation needed" error messages from
middleboxes inside a tunnel, but are advised to consider them as
"soft errors". Implementers are advised to consult
[RFC1191][RFC2923][RFC4821] for operational recommendations on
processing ICMPv4 "fragmentation needed" messages.
ITEs may receive encapsulated ICMPv6 "packet too big" messages
[RFC1981] from an ETE at the far end of a tunnel (see:
Section 4.5.2). The ITE SHOULD cache the MTU value encoded in the
"packet too big" message as the new MTU for the flow, and relay the
ICMPv6 message back to the original source.
ITEs may receive encapsulated ICMPv6 "parameter problem" messages
with code "reassembly/checksum error" [RFC4443] from an ETE at the
far end of the tunnel (see: Section 4.5.2). This may indicate an
isolated packet splicing error at the ETE, or packet loss due to
temporal network conditions such as congestion, MTU restrictions,
link errors, signal intermittence, etc. If the ITE receives
persistent reassembly/checksum errors from an ETE, it SHOULD take
adaptive measures, e.g., reduce the SEGSIZE for the flow, rate-limit
the packets it sends into the tunnel, etc. Since each reassembly/
checksum error corresponds to a dropped packet, the ITE SHOULD relay
the messages back to the original source (subject to rate limiting).
4.5. Egress Tunnel Endpoint Specification
The following subsections specify mechanisms implemented by the ETE:
4.5.1. Decapsulation and Reassembly
The IPv4 length, ULPID, SEGID and A fields in the IPv4 packets in a
chain (along with the IPv6 flow label [RFC3697]) provide sufficient
information for the ETE to reassemble an original ULP with protection
for packet reordering in the network. ETEs MUST configure per-flow
reassembly buffers of at least 1280 bytes and SHOULD configure
reassembly buffers of 9180 bytes or larger to accommodate the nominal
tunnel MTU (see: Section 4.3). Note that these reassembly buffers
occur at the Sub-IP layer and are thus distinct from the IPv4 and
IPv6 reassembly caches.
ETEs use per-flow reassembly buffers to concatenate the segments
received in packet chains for a particular ULPID in increasing SEGID
order (i.e., SEGID 0, followed by SEGID 1, etc.) even if the packets
were re-ordered by the network. When all segments for a particular
ULPID have been concatenated into the reassembly buffer, the ETE uses
2's complement Fletcher-32 to verify the checksum if one was included
(see: Section 4.4.1). The ETE then discards the Sub-IP layer
encapsulation headers and trailing checksum, and delivers correctly-
reassembled ULPs to the IP layer (i.e., IPv6). It discards
incomplete ULPs and ULPs with incorrect checksums, and sends an
appropriate error message as specified in Section 4.5.2.
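The reassembly procedure above might be sketched in Python as follows,
operating on segments that have already been decapsulated and grouped
by flow and ULPID. The function names and the dictionary input shape
are hypothetical. As in any checksum sketch for this document, the
big-endian word order, zero padding of odd-length data, and A-then-B
trailer order are assumptions, and a chain of exactly one segment is
taken to carry no trailing checksum.

```python
def fletcher32_twos(data):
    """(A, B) sums over big-endian 16-bit words, modulo 2**16 (assumed)."""
    if len(data) % 2:
        data += b"\x00"
    a = b = 0
    for i in range(0, len(data), 2):
        a = (a + ((data[i] << 8) | data[i + 1])) & 0xFFFF
        b = (b + a) & 0xFFFF
    return a, b


def reassemble(segments):
    """segments maps SEGID -> (payload, A flag) for one flow and ULPID.
    Returns the ULP, or None if the chain is incomplete or corrupt."""
    finals = [sid for sid, (_, more) in segments.items() if not more]
    if len(finals) != 1:
        return None                       # final segment missing or ambiguous
    last = finals[0]
    if sorted(segments) != list(range(last + 1)):
        return None                       # a segment is missing or out of range
    # Concatenate in increasing SEGID order regardless of arrival order.
    data = b"".join(segments[sid][0] for sid in range(last + 1))
    if last == 0:
        return data                       # single segment: no checksum
    if len(data) < 4:
        return None
    ulp, trailer = data[:-4], data[-4:]
    a, b = fletcher32_twos(ulp)
    if trailer != a.to_bytes(2, "big") + b.to_bytes(2, "big"):
        return None                       # checksum verification failed
    return ulp
```

A None result corresponds to the cases in which the ETE discards the
ULP and sends an error as specified in Section 4.5.2.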
4.5.2. Sending Errors
If the ETE receives a packet chain that would overflow the reassembly
buffer, it discards the chain and sends an ICMPv6 "packet too big"
message [RFC1981] back to the IPv6 source via the reverse tunnel back
to the ITE. The ETE includes in the message body up to 1280 bytes
beginning with the upper layer packet headers (IPv6 and above) and
the contents of the reassembly buffer beyond the upper layer packet
headers; it encodes the size of the reassembly buffer in the MTU
value.
If the ETE receives at least one segment, but one or more segments
are lost and/or checksum verification fails, it SHOULD send an ICMPv6
"parameter problem" message with code "reassembly/checksum error"
[RFC4443] back to the IPv6 source via the reverse tunnel back to the
ITE. The ETE includes in the message body up to 1280 bytes beginning
with the upper layer packet headers (IPv6 and above) and contents of
the reassembly buffer beyond the upper layer packet headers, and sets
the pointer to either the beginning of the first missing segment or
the beginning of the 4 byte checksum field (if no segments were
missing).
After sending the error, the ETE discards the packet-in-error, i.e.,
it does not deliver the packet as a ULP to the IP layer.
4.5.3. Sending Probe Replies
If the ETE receives a segment used for probing (i.e., an IPv4 packet
in the chain with the 'P' flag set), it sends a Node Information
Query reply (NI Reply) [RFC4620] message back to the ITE. The ETE
MUST construct the NI Reply as follows:
o the Type, Code, Qtype and Flags fields set as specified for a NOOP
reply in ([RFC4620], Section 6.1), and
o the IPv4 length of the probe packet encoded in bits 0-15 of the
Nonce field, and
o the IPv4 identification of the probe packet encoded in bits 16-31
of the Nonce field, and
o the IPv6 flow label value encoded in bits 32-51 of the Nonce field
If the IPv4 packet containing the probe segment encodes the value 63
in the SEGID field, the ETE discards the segment; otherwise, it
includes the segment as part of the normal reassembly procedure
described above.
4.5.4. Active Reassembly Buffer Management
The ETE MUST actively manage reassembly buffers and discard as early
as possible any reassemblies that are not likely to complete due to,
e.g., loss of one or more packets in the chain, gross reordering of
packets in the network, etc. In particular, the ETE must discard
partial reassemblies before the 8-bit ULPID encoded by the ITE wraps.
The ETE therefore must augment the classical timer-driven reassembly
buffer management strategy with an event-driven strategy.
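This document does not prescribe a particular event-driven strategy.
The following Python sketch shows one hypothetical approach, in which
the arrival of a segment triggers the discard of partial reassemblies
whose ULPID lags too far behind the ULPID just seen. The class name,
the window parameter and its default, and the assumption that the ITE
assigns ULPIDs sequentially modulo 256 are all illustrative choices; a
real implementation would also retain a timer-driven discard, which is
omitted here.

```python
class ReassemblyTable:
    """Per-flow table of partial reassemblies keyed by 8-bit ULPID."""

    def __init__(self, window=64):
        self.window = window     # hypothetical tolerance, in ULPID steps
        self.partial = {}        # ULPID -> {SEGID: payload}

    def on_segment(self, ulpid, segid, payload):
        # Event-driven purge: distance (mod 256) by which each stored
        # ULPID trails the one just seen. Distances near 256 are treated
        # as "slightly ahead" (reordering) and are not purged.
        stale = [
            u for u in self.partial
            if self.window < (ulpid - u) % 256 < 256 - self.window
        ]
        for u in stale:
            del self.partial[u]  # discard before the ULPID space wraps
        self.partial.setdefault(ulpid, {})[segid] = payload
        return stale
```

The returned list identifies the discarded reassemblies, for which an
ETE could send errors as specified in Section 4.5.2.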
5. IANA Considerations
The IANA is instructed to assign a code type for "reassembly/checksum
error" under the ICMPv6 Parameter Problem message type in the "ICMPv6
Type Numbers" registry.
6. Security Considerations
The nonce values in NI Reply messages from ETEs provide spoofing
protection against off-path attackers.
7. Acknowledgments
This work has benefited from helpful discussions with many
colleagues, friends and family.
8. Appendix A: Additional Considerations
ITEs can use the probing mechanism described in Section 4.4.3 as a
general-purpose method for eliciting acknowledgements from an ETE if
improved reliability at the expense of additional overhead is
desired.
The equal size restriction for non-final segments and the
non-overlapping restriction for all segments in packet chains provide
a significant simplification for reassembly algorithms [RFC0815].
Use of the link adaptation mechanisms specified in this document may
lead to an overall increase in short chains of small packets in the
Internet. Network administrators are advised to follow the
recommendations in [RFC3150] to minimize packet loss and packet
reordering. Also, overly-long packet chains should be avoided if
possible due to interactions with Active Queue Management (AQM) in
the network.
Since link-layer CRC-32 checks normally occur on each segment in the
path, most errors detected during ULP reassembly are due to packet
splices and/or errors in the data path between the NIC hardware and
the reassembly buffer. The Fletcher-32 checksum algorithm has been
shown to provide an effective edge-to-edge error detection capability
for such errors [STONE]. The Fletcher-32 checksum is also dissimilar
from both CRC-32 and the Internet checksum used by many upper layer
protocols, thereby decreasing the likelihood of undetected errors.
Some upper layer packetization protocols (e.g., NFS) may generate
fixed payload sizes and rely on the network layer to deliver the
payloads either as whole IP packets or as chains of IP fragments.
Since NFS performance (and the performance of other upper layer
packetization protocols) is sensitive to packet handling overhead,
implementations should periodically attempt to increase the SEGSIZE
through probing even if initial probe attempts fail.
9. Appendix B: Changes
(Note to RFC Editor - please remove this section before publishing as
an RFC.)
Changes since -05:
o Added back informative references to common tunneling mechanisms.
o Citation of RFC4459
Changes since -04:
o Rearranged sections for clarity.
o removed setting of IPv4 "Reserved Fragmentation", since ITE/ETE
capabilities can be discovered during the initial tunnel
negotiation.
Changes since -03:
o Clarified that mechanisms cover IPv6-in-(foo)-in-IPv4; not just
IPv6-in-IPv4.
o New terminology for ITE/ETE
o Clarifications to layering model
o Replaced RA with NI Reply as probe response
o Reduced SEGID to 6 bits and increased ULPID to 8 bits
o IPv6 flow label RFC cited
Changes since -01, -02:
o Updated references
Changes since -00:
o Defined new coding of segmentation/reassembly info in the IPv4
Identification field
o Changed "tunneling mechanism" to "tunnel endpoint"
o Clarified text on trailing checksums
o general document cleanup; removed "additional considerations" that
no longer apply
10. References
10.1. Normative References
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
September 1981.
[RFC1122] Braden, R., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, October 1989.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, December 1998.
[RFC3697] Rajahalme, J., Conta, A., Carpenter, B., and S. Deering,
"IPv6 Flow Label Specification", RFC 3697, March 2004.
[RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control
Message Protocol (ICMPv6) for the Internet Protocol
Version 6 (IPv6) Specification", RFC 4443, March 2006.
[RFC4620] Crawford, M. and B. Haberman, "IPv6 Node Information
Queries", RFC 4620, August 2006.
10.2. Informative References
[AARNET] "AARNet: Network: Large MTU: Size, http://
www.aarnet.edu.au/engineering/networkdesign/mtu/
size.html", April 2007.
[FRAG] Mogul, J. and C. Kent, "Fragmentation Considered Harmful,
In Proc. SIGCOMM '87 Workshop on Frontiers in Computer
Communications Technology.", August 1987.
[GIGE] Dykstra, P., "Gigabit Ethernet Jumboframes (And Why You
Should Care), http://sd.wareonearth.com/~phil/jumbo.html",
December 1999.
[I-D.heffner-frag-harmful]
Heffner, J., "IPv4 Reassembly Errors at High Data Rates",
draft-heffner-frag-harmful-05 (work in progress),
May 2007.
[JAIN] Jain, R., "Error Characteristics of Fiber Distributed Data
Interface (FDDI),
http://www.cse.wustl.edu/~jain/papers.html", August 1990.
[RFC0815] Clark, D., "IP datagram reassembly algorithms", RFC 815,
July 1982.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
November 1990.
[RFC1626] Atkinson, R., "Default IP MTU for use over ATM AAL5",
RFC 1626, May 1994.
[RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
for IP version 6", RFC 1981, August 1996.
[RFC2684] Grossman, D. and J. Heinanen, "Multiprotocol Encapsulation
over ATM Adaptation Layer 5", RFC 2684, September 1999.
[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery",
RFC 2923, September 2000.
[RFC3056] Carpenter, B. and K. Moore, "Connection of IPv6 Domains
via IPv4 Clouds", RFC 3056, February 2001.
[RFC3150] Dawkins, S., Montenegro, G., Kojo, M., and V. Magret,
"End-to-end Performance Implications of Slow Links",
BCP 48, RFC 3150, July 2001.
[RFC3385] Sheinwald, D., Satran, J., Thaler, P., and V. Cavanna,
"Internet Protocol Small Computer System Interface (iSCSI)
Cyclic Redundancy Check (CRC)/Checksum Considerations",
RFC 3385, September 2002.
[RFC3819] Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
Wood, "Advice for Internet Subnetwork Designers", BCP 89,
RFC 3819, July 2004.
[RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
for IPv6 Hosts and Routers", RFC 4213, October 2005.
[RFC4214] Templin, F., Gleeson, T., Talwar, M., and D. Thaler,
"Intra-Site Automatic Tunnel Addressing Protocol
(ISATAP)", RFC 4214, October 2005.
[RFC4380] Huitema, C., "Teredo: Tunneling IPv6 over UDP through
Network Address Translations (NATs)", RFC 4380,
February 2006.
[RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the-
Network Tunneling", RFC 4459, April 2006.
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU
Discovery", RFC 4821, March 2007.
[STONE] Stone, J., "Checksums in the Internet (Stanford Doctoral
Dissertation)", August 2001.
[WLAN] Society, I., "Part 11: Wireless LAN Medium Access Control
(MAC) and Physical Layer (PHY) Specifications, IEEE
Computer Society, ANSI/IEEE 802.11, 1999 Edition.".
Author's Address
Fred L. Templin (editor)
Boeing Phantom Works
P.O. Box 3707
Seattle, WA 98124
USA
Email: fred.l.templin@boeing.com
Full Copyright Statement
Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).