Network Working Group                                    F. Templin, Ed.
Internet-Draft                                      Boeing Phantom Works
Intended status: Informational                        September 25, 2007
Expires: March 28, 2008


      Packetization Layer Path MTU Discovery for IP/*/IPv4 Tunnels
                      draft-templin-inetmtu-01.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 28, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   The nominal Maximum Transmission Unit (MTU) of the Internet has
   become 1500 bytes, but existing IP/*/IPv4 tunneling mechanisms impose
   an encapsulation overhead that can reduce the effective path MTU to
   smaller values.  Additionally, existing IP/*/IPv4 tunneling
   mechanisms are limited in their ability to discover and utilize
   larger MTUs.  This document specifies new mechanisms for conveying
   packets over IP/*/IPv4 tunnels that address these issues.




Templin                  Expires March 28, 2008                 [Page 1]


Internet-Draft            PLPMTUD  for Tunnels            September 2007


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Concept of Operation . . . . . . . . . . . . . . . . . . . . .  4
   4.  Tunnel MTU and MRU . . . . . . . . . . . . . . . . . . . . . .  4
   5.  Tunnel Soft State  . . . . . . . . . . . . . . . . . . . . . .  5
   6.  Sending Packets  . . . . . . . . . . . . . . . . . . . . . . .  5
     6.1.  Conceptual Sending Algorithm . . . . . . . . . . . . . . .  6
     6.2.  Inner Packet Fragmentation . . . . . . . . . . . . . . . .  7
     6.3.  Encapsulation  . . . . . . . . . . . . . . . . . . . . . .  7
       6.3.1.  Footer . . . . . . . . . . . . . . . . . . . . . . . .  7
       6.3.2.  Trailing Data and Checksum . . . . . . . . . . . . . .  8
       6.3.3.  Data, Probe Request, and Probe Solicitation Format . .  9
       6.3.4.  Probe Reply Format . . . . . . . . . . . . . . . . . .  9
     6.4.  Outer Packet Fragmentation . . . . . . . . . . . . . . . . 11
     6.5.  Setting DF in the Outer Header . . . . . . . . . . . . . . 11
     6.6.  Window Management  . . . . . . . . . . . . . . . . . . . . 11
   7.  Receiving Packets  . . . . . . . . . . . . . . . . . . . . . . 11
     7.1.  Decapsulation  . . . . . . . . . . . . . . . . . . . . . . 11
     7.2.  Receiving Packet Too Big (PTB) Errors  . . . . . . . . . . 12
   8.  Tunnel Qualification and Soft State Management . . . . . . . . 12
     8.1.  Probe Requests . . . . . . . . . . . . . . . . . . . . . . 12
       8.1.1.  Sending Probe Requests . . . . . . . . . . . . . . . . 12
       8.1.2.  Receiving Probe Requests . . . . . . . . . . . . . . . 13
     8.2.  Probe Solicitations  . . . . . . . . . . . . . . . . . . . 13
       8.2.1.  Sending Probe Solicitations  . . . . . . . . . . . . . 13
       8.2.2.  Receiving Probe Solicitations  . . . . . . . . . . . . 14
     8.3.  Probe Replies  . . . . . . . . . . . . . . . . . . . . . . 14
       8.3.1.  Sending Probe Replies  . . . . . . . . . . . . . . . . 14
       8.3.2.  Receiving Probe Replies  . . . . . . . . . . . . . . . 15
   9.  8-bit Fletcher Checksum Calculation  . . . . . . . . . . . . . 16
   10. Updated Specifications . . . . . . . . . . . . . . . . . . . . 16
   11. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 17
   12. Security Considerations  . . . . . . . . . . . . . . . . . . . 17
   13. Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 18
   14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18
     14.1. Normative References . . . . . . . . . . . . . . . . . . . 18
     14.2. Informative References . . . . . . . . . . . . . . . . . . 18
   Appendix A.  Discussion  . . . . . . . . . . . . . . . . . . . . . 19
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 20
   Intellectual Property and Copyright Statements . . . . . . . . . . 21


1.  Introduction

   The nominal Maximum Transmission Unit (MTU) of today's Internet has
   become 1500 bytes due to the preponderance of networking gear that
   configures an MTU of that size.  However, not all links in the
   Internet configure a 1500 byte MTU [RFC3819], so packets can be
   dropped due to an MTU restriction on the path.  Internet Protocol,
   Version 4 (IPv4) [RFC0791] is the predominant network layer protocol
   in the Internet today, and it is likely that IPv4 use will continue
   to grow into the future.  It is therefore essential that tunnels
   over IPv4 (hereafter called IP/*/IPv4 tunnels) be made capable of
   consistent and efficient handling of packets of various sizes.

   Upper layers see IP/*/IPv4 tunnels as ordinary links, but even for
   packets no larger than 1500 bytes these links are susceptible to
   silent loss (e.g., due to path MTU restrictions, lost error messages,
   layered encapsulations, reassembly buffer limitations, etc.)
   resulting in poor performance and/or communications failures
   [RFC2923][RFC4459][RFC4821][RFC4963].

   This document specifies new mechanisms for IP/*/IPv4 tunnels that
   assure robust handling for packets of various sizes; it updates the
   functional specifications for Tunnel Endpoints (TEs) found in
   existing IP/*/IPv4 tunneling mechanisms (see: Section 10).


2.  Terminology

   The following abbreviations and terms are used in this document:

      DF - the IPv4 header "Don't Fragment" flag ([RFC0791], Section
      3.1).

      EMTU_R - Effective MTU to Receive ([RFC1122], Section 3.3.2).

      ENCAPS - the size of the encapsulating */IPv4 headers plus
      trailers.

      IPv4 - Internet Protocol, Version 4

      IPv6 - Internet Protocol, Version 6

      MaxOuterPktLen - Maximum Outer Packet Length, in bytes

      MaxInnerPktLen - Maximum Inner Packet Length, in bytes

      ReassTime - Reassembly Timeout

      MRU - Maximum Receive Unit.  For the purpose of this document,
      'MRU' has exactly the same meaning as 'EMTU_R'

      MTU - Maximum Transmission Unit

      PTB - Packet Too Big error

      TE - Tunnel Endpoint

      TFE - Tunnel Far End

      TNE - Tunnel Near End

      IP/*/IPv4 - an IP packet encapsulated in */IPv4 headers (e.g., for
      "*" = NULL, UDP, TCP, AH, ESP, etc.).

      inner packet/header/payload - an IP packet/header/payload before
      IP/*/IPv4 encapsulation.

      outer packet/header/payload - a */IPv4 packet/header/payload after
      IP/*/IPv4 encapsulation.


3.  Concept of Operation

   TEs that implement this scheme engage in a continuous handshaking
   process while data is flowing through the tunnel to confirm that the
   TFE is participating and to maintain soft state used for determining
   maximum packet sizes.  When the flow of data through the tunnel is
   suspended, the handshaking process is discontinued.  When one or both
   of the TEs do not implement the scheme, the behavior automatically
   reverts to that of the legacy IP/*/IPv4 tunneling mechanism.


4.  Tunnel MTU and MRU

   TEs configure an indefinite MTU on the tunnel interface, i.e., there
   is no logical limit on the size of inner packets that upper layers
   can present to the tunnel interface.

   TEs MUST configure an MRU (i.e., an EMTU_R) that is no smaller than
   2048 bytes (2KB) on all IPv4 interfaces over which a tunnel interface
   is configured.  Additionally, they MUST configure an MRU that is no
   smaller than 2KB on the tunnel interface, and SHOULD configure an MRU
   that is no smaller than the largest MRU of any IPv4 interfaces over
   which the tunnel is configured.


5.  Tunnel Soft State

   TEs maintain the following per-TFE conceptual variables as soft state
   (e.g., in a conceptual neighbor cache):

   MaxOuterPktLen
      the current maximum length outer packet/fragment that can be
      accommodated by the IPv4 path MTU without further fragmentation.
      Recommended default value: 128 bytes.  Range: 68 bytes to 64KB.

   MaxInnerPktLen
      the current maximum length inner packet/fragment that the TFE can
      reassemble over the tunnel, i.e., the MRU.  Recommended default
      value: the minimum MRU defined for the specific IP/*/IPv4
      tunneling mechanism (e.g., 1500 bytes for [RFC4213]).  Range: 576
      bytes to (2^32-1) bytes.

   ReassTime
      the current timeout value that the TFE uses for reassembly of
      fragmented packets that traverse the tunnel.  Recommended default
      value: 120 seconds.  Range: 4usec to 4*(2^32)usec (~4.77hr).

   IPv4Id
      the current IPv4 ID value that the TE will assign in the outer
      IPv4 header of packets it sends into the tunnel.  Initial value:
      randomly chosen.  Range: 0 to 2^16-1.

   isQualified
      boolean indicating whether the TFE implements the scheme.
      Recommended default value: FALSE.

   isNAT
      boolean indicating whether there is an IPv4 Network Address
      Translator (NAT) on the path to the TFE.  Default value: TRUE or
      FALSE, based on the specific IP/*/IPv4 tunneling mechanism.

   See: [RFC3819], Section 2 for subnetwork MTU recommendations that
   influence 'MaxOuterPktLen'.

   See: [RFC1122], Section 3.3.2 for EMTU_R (MRU) and reassembly timeout
   recommendations.
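
   The per-TFE variables above could be held in a record such as the
   following Python sketch.  The class name, the field spellings, the
   1500-byte 'MaxInnerPktLen' default (taken from the [RFC4213]
   example), the FALSE default for 'isNAT', and the increment-by-one
   'IPv4Id' step are illustrative assumptions, not part of this
   specification.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class TfeSoftState:
    """Hypothetical per-TFE soft state record (names are illustrative)."""

    max_outer_pkt_len: int = 128         # bytes; range 68 bytes to 64KB
    max_inner_pkt_len: int = 1500        # bytes; assumed mechanism minimum MRU
    reass_time_usec: int = 120_000_000   # 120 seconds, in microseconds
    # Randomly chosen initial value in the range 0 to 2^16-1.
    ipv4_id: int = field(default_factory=lambda: secrets.randbelow(1 << 16))
    is_qualified: bool = False
    is_nat: bool = False                 # assumed; mechanism-specific default

    def next_ipv4_id(self) -> int:
        """Return the current IPv4 ID, then advance it modulo 64K."""
        current = self.ipv4_id
        self.ipv4_id = (self.ipv4_id + 1) % (1 << 16)
        return current
```

   A TE would keep one such record per TFE, e.g., in a dictionary keyed
   by the TFE's IPv4 address, and discard it when the tunnel goes
   dormant.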


6.  Sending Packets

   TEs send packets across a tunnel to the TFE according to the
   following specifications:

6.1.  Conceptual Sending Algorithm

   With reference to Sections 6.2 - 6.6, TEs use the following
   conceptual sending algorithm:

        if inner packet is larger than 'MaxInnerPktLen' and inner
          packet is not fragmentable (see: Section 6.2)
            Send PTB appropriate to the inner protocol (e.g., an
            ICMPv6 PTB [RFC1981]) with MTU = 'MaxInnerPktLen'.
            Drop packet.
        else
            if 'isNAT' and inner packet is not a probe used for
              'MaxOuterPktLen' determination
                if inner packet is larger than 2*('MaxOuterPktLen'
                  - ENCAPS) and inner packet is not fragmentable
                  (see: Section 6.2)
                    Send PTB appropriate to the inner protocol
                    with MTU = 2*('MaxOuterPktLen' - ENCAPS).
                    Drop packet.
                else
                    Fragment inner packet into fragments no larger
                    than MIN('MaxInnerPktLen', 2*('MaxOuterPktLen'
                    - ENCAPS)) (see: Section 6.2).
                endif
            else
                Fragment inner packet into fragments no larger than
                'MaxInnerPktLen' (see: Section 6.2).
            endif
            foreach inner packet/fragment
                Encapsulate as an outer IPv4 packet (see: Section 6.3).
                if outer packet is not a probe used for
                  'MaxOuterPktLen' determination
                    Fragment outer packet into fragments no larger than
                    'MaxOuterPktLen' (see: Section 6.4).
                endif
                foreach outer packet/fragment
                    Set DF in the outer header according to Section 6.5.
                    Send fragment subject to window restrictions
                    (see: Section 6.6).
                endforeach
            endforeach
        endif

                  Figure 1: Conceptual Sending Algorithm
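
   The size decisions in Figure 1 can be sketched in Python as follows.
   This is a minimal illustration only: the function and parameter
   names are hypothetical, and encapsulation, fragmentation, and
   transmission are left out.

```python
from typing import Optional


def inner_fragment_limit(max_inner: int, max_outer: int, encaps: int,
                         is_nat: bool, is_mtu_probe: bool) -> int:
    """Largest inner packet/fragment size that Figure 1 admits."""
    if is_nat and not is_mtu_probe:
        return min(max_inner, 2 * (max_outer - encaps))
    return max_inner


def ptb_mtu(inner_len: int, fragmentable: bool, max_inner: int,
            max_outer: int, encaps: int, is_nat: bool,
            is_mtu_probe: bool) -> Optional[int]:
    """Return the MTU to report in a PTB, or None if the packet is sent."""
    if fragmentable:
        # Fragmentable packets are never dropped for size; they are
        # fragmented to inner_fragment_limit() instead.
        return None
    if inner_len > max_inner:
        return max_inner
    nat_limit = 2 * (max_outer - encaps)
    if is_nat and not is_mtu_probe and inner_len > nat_limit:
        return nat_limit
    return None
```

   For example, with 'MaxOuterPktLen' = 128, an assumed ENCAPS of 28
   bytes, and 'isNAT' TRUE, an unfragmentable 1400-byte inner packet
   yields a PTB with MTU = 200.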

6.2.  Inner Packet Fragmentation

   An inner packet is fragmentable IFF the TE is permitted to break it
   into inner fragments before encapsulation, e.g., an IPv6 packet with
   a fragment header, an IPv4 packet with DF=0, etc.

   TEs break fragmentable inner packets into inner fragments of no more
   than 'MaxInnerPktLen' bytes when 'isNAT' is FALSE and no more than
   MIN('MaxInnerPktLen', 2*('MaxOuterPktLen' - ENCAPS)) bytes when
   'isNAT' is TRUE.  The TE then encapsulates each inner fragment per
   Section 6.3.  These inner fragments will be reassembled by the final
   destination.

   When 'isNAT' is TRUE, 2*('MaxOuterPktLen' - ENCAPS) may not be large
   enough to accommodate the minimum IPv6 MTU.  In that case, the TE may
   be required to drop an IPv6 packet of 1280 bytes or smaller and send
   an ICMPv6 PTB with an MTU value less than 1280 bytes.  The original
   IPv6 source will then include a fragment header in subsequent IPv6
   packets, and the TE can then perform IPv6 fragmentation on these
   inner packets using the fragment header included by the source,
   according to the final paragraph of [RFC2460], Section 5.

6.3.  Encapsulation

   TEs encapsulate inner IP packets according to the specific IP/*/IPv4
   document, except that the TE maintains a randomly-initialized and
   monotonically-increasing (modulo 64K) per-TFE 'IPv4Id' value that it
   encodes in the outer IPv4 headers of successive encapsulated packets.

   The TE also appends trailing data as specified in the following
   sections and increments the innermost '*' header length field by the
   number of trailing data bytes added, e.g., the UDP length field for
   IPv6/UDP/IPv4 tunnels, the IPv4 length field for IPv6/IPv4 tunnels,
   etc.

6.3.1.  Footer

   When trailing data is included (see Section 6.3.2), the TE adds the
   following 4-byte footer as the final 4 bytes of the trailing data.
   The footer is byte-aligned only, and need not be aligned on an even
   word/longword/etc. boundary:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Version| Type  |   Reserved    | Fletcher A    |  Fletcher B   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                          Figure 2: Footer Format

   where the fields of the footer are specified as follows:

   Version (4 bits)
      The Version field indicates the format of the trailing data.  This
      document describes version 1.

   Type (4 bits)
      The type of encapsulated packet.  The following types are defined:

      0 - Ordinary data packet.

      1 - Probe Request (see: Section 8.1).

      2 - Probe Solicitation (see: Section 8.2).

      3 - Probe Reply (see: Section 8.3).

      4 - 15 - Reserved for future use.

   Reserved (8 bits)
      Reserved for future use.

   Fletcher A (8 bits)
      The 8-bit Fletcher A checksum component.

   Fletcher B (8 bits)
      The 8-bit Fletcher B checksum component.
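
   The footer layout in Figure 2 could be encoded and decoded as in the
   following Python sketch.  The function names are hypothetical.  The
   sketch assumes the usual convention that bit 0 in the diagram is the
   most significant bit, so that Version occupies the high nibble of
   the first byte, and it assumes that any */IPv4 protocol trailers
   have already been stripped before decoding.

```python
import struct
from typing import NamedTuple

FOOTER_VERSION = 1  # this document describes version 1


class Footer(NamedTuple):
    version: int
    pkt_type: int
    fletcher_a: int
    fletcher_b: int


def pack_footer(pkt_type: int, fletcher_a: int, fletcher_b: int,
                version: int = FOOTER_VERSION) -> bytes:
    """Build the 4-byte footer; the Reserved byte is sent as zero."""
    if not (0 <= version <= 15 and 0 <= pkt_type <= 15):
        raise ValueError("Version and Type are 4-bit fields")
    first = (version << 4) | pkt_type
    return struct.pack("!BBBB", first, 0, fletcher_a, fletcher_b)


def unpack_footer(packet: bytes) -> Footer:
    """Decode the final 4 bytes of a packet as a footer."""
    if len(packet) < 4:
        raise ValueError("packet too short to carry a footer")
    first, _reserved, a, b = struct.unpack("!BBBB", packet[-4:])
    return Footer(first >> 4, first & 0x0F, a, b)
```

   Because the footer is only byte-aligned, the decoder slices the last
   4 bytes rather than relying on any word alignment of the packet.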

6.3.2.  Trailing Data and Checksum

   The TE MUST include trailing data with a non-zero checksum in the
   footer of all probe request/reply/solicit packets, and MUST include
   trailing data with a non-zero checksum in the footer of data packets
   when 'isNAT' is TRUE.  The TE MAY include trailing data with either a
   zero or non-zero checksum in data packets when 'isNAT' is FALSE, or
   MAY alternately omit trailing data in those packets.

   For probe reply packets, the TE appends zero-filled padding bytes as
   necessary to extend the packet to a minimum of 50 bytes beyond the
   beginning of the inner IP header then appends a 14 byte control block
   as specified in Section 6.3.4.

   For all other packets that will include a non-zero trailing checksum,
   the TE appends zero-filled padding bytes as necessary to extend the
   packet to a minimum of 64 bytes beyond the beginning of the inner IP
   header.

   The TE then calculates the 8-bit Fletcher checksum as specified in
   Section 9 and encodes the results in the Fletcher A and B fields of
   the footer.  The footer is appended as the final 4 bytes of the
   trailing data, as specified in the following sections.

6.3.3.  Data, Probe Request, and Probe Solicitation Format

   The TE uses the following packet format for data, probe request, and
   probe solicitation packets (types 0 through 2):
                          +---------------------------------+
                          |           Outer IPv4            |
                          |        Header w/'IPv4Id'        |
                          +---------------------------------+
                          |            * Headers            |
                          |                                 |
   +-------------+        +---------------------------------+
   |   Inner IP  |        |             Inner IP            |
   ~   packet    ~  ===>  ~             packet              ~
   |             |        |                                 |     T
   +-------------+        +---------------------------------+ -\  r
    Inner Packet          |                                 |  |  a
                          ~           Zero Padding          ~  |  i
                          |                                 |   > l
                          +---------------------------------+  |  e
                          |   Footer (see: Section 6.3.1)   |  |  r
                          +---------------------------------+ -/  s
                          |  Any */IPv4 protocol trailers ...
                          +------------------------------
                              Outer Packet with Trailers

       Figure 3: Data, Probe Request, and Probe Solicitation Format

6.3.4.  Probe Reply Format

   The TE uses the following encapsulation format for all probe reply
   packets (type 3):

                          +--------------------------------+
                          |           Outer IPv4           |
                          |        Header w/'IPv4Id'       |
                          +--------------------------------+
                          |           * Headers            |
                          |                                |
   +-------------+        +--------------------------------+
   |   inner IP  |        |            inner IP            |
   ~    echo     ~  ===>  ~              echo              ~
   |    reply    |        |             reply              |
   +-------------+        +--------------------------------+ -\
     Inner Reply          |                                |  |
                          ~           Zero Padding         ~  |
                          |                                |  |
                          +--------------------------------+  |  T
                          |    YourPort    /    YourId     |  |  r
                          +--------------------------------+  |  a
                          |            YourAddr            |   > i
                          +--------------------------------+  |  l
                          |            ReassTime           |  |  e
                          +--------------------------------+  |  r
                          |          MaxInnerPktLen        |  |  s
                          +--------------------------------+  |
                          |   Footer (see: Section 6.3.1)  |  |
                          +--------------------------------+ -/
                          | Any */IPv4 protocol trailers ...
                          +------------------------------
                              Outer Reply with Trailers

                       Figure 4: Probe Reply Format

   where the following 14-byte "control block" information is included
   immediately following the padding and immediately before the trailing
   footer:

      YourPort (16 bits) - 1's complement of the observed port number of
      the probe request.

      YourId (16 bits) - 1's complement of the observed ip_id number of
      the probe request.

      YourAddr (32 bits) - 1's complement of the observed IPv4 source
      address of the probe request.

      ReassTime (32 bits) - non-zero value between 1 - (2^32-1) in 4usec
      increments.

      MaxInnerPktLen (32 bits) - non-zero value between 576 - (2^32-1)
      in 1 byte increments.

6.4.  Outer Packet Fragmentation

   For packets other than probe requests used for 'MaxOuterPktLen'
   determination, TEs use IPv4 fragmentation to fragment outer packets
   after IPv4 encapsulation into fragments no larger than
   'MaxOuterPktLen' bytes.  These outer fragments will be reassembled by
   the TFE.
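
   As an illustration of the resulting fragment sizes, the following
   Python sketch splits an outer packet so that no fragment exceeds
   'MaxOuterPktLen'.  It assumes a 20-byte outer IPv4 header with no
   options and applies the [RFC0791] rule that every fragment except
   the last carries a payload that is a multiple of 8 bytes.  The
   function name is hypothetical.

```python
from typing import List


def outer_fragment_sizes(total_len: int, max_outer: int,
                         ihl: int = 20) -> List[int]:
    """Return the total length of each outer fragment, in order."""
    if total_len <= max_outer:
        return [total_len]          # fits without fragmentation
    # Payload per non-final fragment, rounded down to a multiple of 8.
    per_frag = (max_outer - ihl) // 8 * 8
    if per_frag <= 0:
        raise ValueError("MaxOuterPktLen too small to carry any payload")
    payload = total_len - ihl
    sizes = []
    while payload > per_frag:
        sizes.append(ihl + per_frag)
        payload -= per_frag
    sizes.append(ihl + payload)     # final fragment carries the remainder
    return sizes
```

   With the default 'MaxOuterPktLen' of 128 bytes, a 1528-byte outer
   packet becomes 14 fragments of 124 bytes followed by one fragment of
   72 bytes.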

6.5.  Setting DF in the Outer Header

   TEs MUST set DF=1 in the outer IPv4 header of probe requests to be
   used for 'MaxOuterPktLen' determination.  TEs MAY set DF=0 in the
   outer header of other probe requests and SHOULD set DF=0 in the outer
   header of probe replies.

   TEs MUST set DF=1 in the outer header of ordinary data packets/
   fragments when 'isNAT' is TRUE.

   TEs MAY set DF=0 in the outer header of ordinary data packets/
   fragments when 'isNAT' is FALSE.
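
   The rules above could be collected into a single decision function,
   as in the following Python sketch.  Where the text says MAY or
   SHOULD, the sketch simply picks DF=0.  Probe solicitations are not
   addressed in this section, so DF=0 for them is an assumption of the
   sketch.  All names are hypothetical.

```python
from enum import IntEnum


class PktType(IntEnum):
    DATA = 0
    PROBE_REQUEST = 1
    PROBE_SOLICITATION = 2
    PROBE_REPLY = 3


def outer_df(pkt_type: PktType, is_nat: bool,
             is_mtu_probe: bool = False) -> bool:
    """Return True when DF=1 is to be set in the outer IPv4 header."""
    if pkt_type == PktType.PROBE_REQUEST:
        # MUST be DF=1 for 'MaxOuterPktLen' probes; MAY be DF=0 otherwise.
        return is_mtu_probe
    if pkt_type == PktType.PROBE_REPLY:
        return False                # SHOULD be DF=0
    if pkt_type == PktType.DATA:
        # MUST be DF=1 when 'isNAT' is TRUE; MAY be DF=0 otherwise.
        return is_nat
    return False                    # assumption for probe solicitations
```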

6.6.  Window Management

   TEs send packets into a tunnel according to a window based on the
   TFE's advertised 'ReassTime'.  In particular, the TE must not admit
   more than 2^16 packets into the tunnel within the 'ReassTime' window.

   TE implementations should use discretion when not all of the inner
   and outer fragments of the original packet could be admitted into the
   tunnel within the current window, i.e., implementations are advised
   to determine when it is appropriate to admit some fragments vs. drop
   all fragments.
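
   One way to enforce the 2^16 packet limit is a sliding window over
   send timestamps, as in the following Python sketch.  Interpreting
   the 'ReassTime' window as a sliding interval is an assumption of the
   sketch, and the class and method names are hypothetical.

```python
from collections import deque


class SendWindow:
    """Admit at most max_packets packets per 'ReassTime' interval."""

    def __init__(self, reass_time_sec: float,
                 max_packets: int = 1 << 16) -> None:
        self.reass_time_sec = reass_time_sec
        self.max_packets = max_packets
        self._sent = deque()        # timestamps of admitted packets

    def try_admit(self, now: float) -> bool:
        """Record and admit a packet at time 'now' if the window allows."""
        # Forget packets that left the window at least ReassTime ago.
        while self._sent and now - self._sent[0] >= self.reass_time_sec:
            self._sent.popleft()
        if len(self._sent) >= self.max_packets:
            return False
        self._sent.append(now)
        return True
```

   A caller that must send several fragments of one packet could check
   the remaining capacity first and drop the whole packet when not all
   fragments fit, in line with the advice above.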


7.  Receiving Packets

7.1.  Decapsulation

   TEs decapsulate each outer packet they receive exactly as specified
   in the appropriate IP/*/IPv4 document, except that when 'isQualified'
   is TRUE and the packet includes a non-zero trailing checksum, the TE
   first verifies the checksum in the outer packet as specified in
   Section 9.  If the A and B results of the checksum calculation match
   the values stored in the trailing checksum, the TE decapsulates the
   packet; otherwise, it drops the packet.

   Note that the initial probe request/reply packets from a new TFE will
   be received before 'isQualified' is set to TRUE.  The TE also
   decapsulates these packets, as specified in Section 8.

7.2.  Receiving Packet Too Big (PTB) Errors

   TEs may receive ICMPv4 PTB errors with Type=3 ("Destination
   Unreachable") and Code=4 ("fragmentation needed, and DF set") that
   include a Next-Hop MTU value [RFC1191] in response to any packets
   that were admitted into the tunnel with DF=1 [RFC0792].

   When a TE receives an ICMPv4 PTB with a Next-Hop MTU value smaller
   than 'MaxOuterPktLen', it SHOULD reduce 'MaxOuterPktLen' and/or
   actively probe to discover and confirm a new 'MaxOuterPktLen'.  The
   TE SHOULD NOT send a translated PTB back to the inner source.
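
   A minimal Python sketch of the "reduce" option follows.  It lowers
   'MaxOuterPktLen' only when the reported Next-Hop MTU is smaller, and
   it clamps the result to the 68-byte lower bound given in Section 5.
   The clamp and the function name are assumptions of the sketch;
   active probing is not shown.

```python
MIN_OUTER_PKT_LEN = 68  # lower bound of the 'MaxOuterPktLen' range


def on_ptb(max_outer_pkt_len: int, next_hop_mtu: int) -> int:
    """Return the new 'MaxOuterPktLen' after an ICMPv4 PTB."""
    if next_hop_mtu >= max_outer_pkt_len:
        return max_outer_pkt_len    # PTB does not call for a reduction
    return max(next_hop_mtu, MIN_OUTER_PKT_LEN)
```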


8.  Tunnel Qualification and Soft State Management

   TEs engage in a probing process to qualify new TFEs and refresh per-
   TFE soft state for qualified TFEs thereafter.  TEs discontinue the
   probing process and garbage-collect stale soft state for dormant
   tunnels and unqualified TFEs.  TEs exchange probe requests, probe
   solicitations and probe replies as specified in the following
   sections:

8.1.  Probe Requests

   TEs send and receive probe requests as specified below:

8.1.1.  Sending Probe Requests

8.1.1.1.  Basic Probing Strategy

   TEs send probe requests while data is actively flowing through the
   tunnel.  The TE sends initial probe requests to qualify each new TFE,
   then sends periodic probe requests thereafter.  The TE SHOULD limit
   the rate at which it sends probe requests to each TFE, but MUST probe
   frequently enough to refresh per-TFE conceptual variables.

   The TE retains a cache of recently-sent probe requests and uses them
   to verify subsequent probe replies.

8.1.1.2.  MaxOuterPktLen Probing

   The TE SHOULD probe to detect larger 'MaxOuterPktLen' values by
   sending progressively larger probe requests padded to the desired
   probe size.  When the TE receives sufficient evidence through probing
   that the forward path to the TFE supports the probed size, it
   advances 'MaxOuterPktLen' to the probe size.  The TE SHOULD NOT send
   probe requests larger than ('MaxInnerPktLen' + ENCAPS).  The TE MAY
   send a series of probes in parallel to mitigate 'MaxOuterPktLen'
   fluctuations in the case of multipath routes with diverse path MTUs.
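
   The following Python sketch shows one possible way to choose the
   next probe sizes.  The candidate list is purely illustrative; this
   document does not specify which sizes to try.  The only rules taken
   from the text are that probes are larger than the current
   'MaxOuterPktLen' and no larger than ('MaxInnerPktLen' + ENCAPS).

```python
from typing import List, Sequence

# Hypothetical candidate probe sizes, in bytes.
CANDIDATES = (576, 1280, 1400, 1500, 4352, 9000)


def next_probe_sizes(max_outer: int, max_inner: int, encaps: int,
                     candidates: Sequence[int] = CANDIDATES) -> List[int]:
    """Return candidate probe sizes in increasing order."""
    ceiling = max_inner + encaps
    return [size for size in sorted(candidates)
            if max_outer < size <= ceiling]
```

   A TE probing in parallel could send all returned sizes at once; a TE
   probing serially would send the first and advance only after
   receiving sufficient evidence that it was delivered.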

8.1.1.3.  Generating and Sending Probe Requests

   TEs generate probe requests by creating a minimum-sized and
   unfragmentable IP echo request packet according to the inner IP
   protocol (e.g., an ICMPv6 echo request [RFC4443] when the inner IP
   protocol is IPv6).  The echo request MUST include source and
   destination addresses that correspond to the TNE and TFE
   respectively, and SHOULD include additional identifying information
   (e.g., sequence/identification numbers, nonce values, etc.) that the
   TFE will echo in its reply.  The TE then encapsulates the echo
   request with padding added to create an outer probe request of the
   desired probe size and sends the probe request into the tunnel as
   specified in Section 6.

8.1.2.  Receiving Probe Requests

   When a TE receives a potential probe request from a TFE (i.e., as
   indicated by the potential trailing footer), it first determines
   whether the packet includes a valid trailing checksum.  If the packet
   did not include a valid trailing checksum, the TE discontinues probe
   request processing, decapsulates the packet as for ordinary data and
   returns from processing.  Otherwise, the TE generates a probe reply
   as specified in Section 8.3.

8.2.  Probe Solicitations

   TEs send and receive probe solicitations as specified below:

8.2.1.  Sending Probe Solicitations

   When a TE has new information to convey to a TFE, but has not
   received recent probe requests from the TFE, it MAY send a probe
   solicitation to the TFE.  The TE creates a NULL inner IP packet
   (e.g., an IPv6 header with "No Next Header" in the Next Header field)
   with source and destination addresses that correspond to the TNE and
   TFE respectively.  The TE then encapsulates the NULL packet as a
   probe solicitation and sends it into the tunnel as specified in
   Section 6.

8.2.2.  Receiving Probe Solicitations

   When a TE receives a potential probe solicitation from a TFE, it
   first determines whether the packet includes a valid trailing
   checksum.  If the packet did not include a valid trailing checksum,
   the TE discontinues probe solicitation processing, decapsulates the
   inner packet as for ordinary data and returns from processing.

   Otherwise, the TE SHOULD send an expedited probe request with DF=0 to
   the TFE as specified in Section 8.1 if it has not successfully probed
   the TFE recently.  The TE then discards the probe solicitation.

8.3.  Probe Replies

   TEs send and receive probe replies as specified below:

8.3.1.  Sending Probe Replies

   TEs send probe replies in response to valid probe requests and use
   them as a mechanism for advertising 'MaxInnerPktLen' and 'ReassTime'
   values to the TFE.  TEs also use probe replies to inform the TFE of
   the IPv4 address and protocol port number that it observed in the
   TFE's probe request.

   The TE creates an inner IP echo reply packet according to the inner
   IP protocol (e.g., an ICMPv6 echo reply [RFC4443] when the inner
   protocol is IPv6).  The TE includes in the echo reply the destination
   address of the echo request as the source address and the source
   address of the echo request as the destination address.  The TE
   also includes in the echo reply any additional identifying
   information that the TFE included in its echo request.

   The TE then encapsulates the echo reply as specified in Section 6.3.
   For IP/*/IPv4 tunneling mechanisms that include a port number in the
   encapsulating * header, the TE includes the 1's complement of the
   protocol source port number it observed in the TFE's probe request
   (e.g., the UDP source port number for IPv6/UDP/IPv4 encapsulation) in
   the 16-bit 'YourPort' field.  (Otherwise, the TE encodes the value
   '0' in the 'YourPort' field.)  The TE next includes the 1's
   complement of the ip_id it observed in the outer IPv4 header of the
   TFE's probe request in the 16-bit 'YourId' field and encodes the 1's
   complement of the IPv4 source address of the probe request in the
   32-bit 'YourAddr' field.
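
   The encoding of the observed values can be sketched in Python as
   follows.  The sketch packs 'YourPort', 'YourId', and 'YourAddr' in
   the order shown in Figure 4 and assumes network byte order.  The
   function names are hypothetical.

```python
import ipaddress
import struct
from typing import Optional


def ones_complement(value: int, bits: int) -> int:
    """Return the 1's complement of value within a field of 'bits' bits."""
    return ~value & ((1 << bits) - 1)


def encode_observed(src_addr: str, src_port: Optional[int],
                    ip_id: int) -> bytes:
    """Pack YourPort, YourId, and YourAddr for a probe reply."""
    # When the * header carries no port number, 'YourPort' is encoded as 0.
    your_port = 0 if src_port is None else ones_complement(src_port, 16)
    your_id = ones_complement(ip_id, 16)
    your_addr = ones_complement(int(ipaddress.IPv4Address(src_addr)), 32)
    return struct.pack("!HHI", your_port, your_id, your_addr)
```

   For a probe request observed from 192.0.2.1, port 3544, with ip_id
   0x1234, this yields the bytes f2 27 ed cb 3f ff fd fe.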

   The TE next includes a value that is less than or equal to an MRU
   appropriate for the interface the TFE's probe request arrived on in
   the 'MaxInnerPktLen' field.  The TE MAY choose to dynamically
   increase or decrease the 'MaxInnerPktLen' values it advertises to a
   TFE in successive probe replies, but if so it SHOULD seek to converge
   to a stable value.

   The TE finally includes a reassembly timeout value appropriate for
   the interface the TFE's probe request arrived on in the 'ReassTime'
   field.  The TE MAY choose to dynamically increase or decrease the
   'ReassTime' value it advertises to a TFE in successive probe replies,
   but if so it SHOULD seek to converge to a stable value.

   Following the encoding of the above trailing data, the TE appends the
   trailing checksum and sends the reply to the TFE.

8.3.2.  Receiving Probe Replies

8.3.2.1.  Probe Reply Verification

   When a TE receives a potential probe reply from a TFE, it first
   determines whether the packet includes a valid trailing checksum.
   The TE next verifies that the packet includes enough trailing data to
   contain a probe reply control block (see Section 6.3.4), then
   examines the 'MaxInnerPktLen' and 'ReassTime' values in the potential
   control block.  If the packet did not include a valid trailing
   checksum, or the packet did not include a control block, or either
   of the 'MaxInnerPktLen' or 'ReassTime' values in the potential
   control block lies outside the acceptable ranges listed in Section
   6.3.4, the TE discontinues probe reply processing, decapsulates the
   packet as for ordinary data, and returns from processing.
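
   The order of the checks above can be sketched as follows.  This is a
   minimal illustration, not part of the specification: the 'Limits'
   type, the function name and the numeric bounds in the usage lines
   are hypothetical stand-ins for the acceptable ranges that Section
   6.3.4 defines, and checksum and control block parsing are assumed to
   have been done by the caller.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # Hypothetical container for the acceptable ranges of Section 6.3.4.
    min_inner_len: int
    max_inner_len: int
    min_reass_time: int
    max_reass_time: int


def classify_packet(checksum_valid: bool, has_control_block: bool,
                    max_inner_pkt_len: int, reass_time: int,
                    limits: Limits) -> str:
    """Return 'probe-reply' if processing continues, else 'ordinary-data'."""
    # A missing checksum or control block ends probe reply processing.
    if not checksum_valid or not has_control_block:
        return "ordinary-data"
    # Either value outside its acceptable range also ends processing.
    if not limits.min_inner_len <= max_inner_pkt_len <= limits.max_inner_len:
        return "ordinary-data"
    if not limits.min_reass_time <= reass_time <= limits.max_reass_time:
        return "ordinary-data"
    return "probe-reply"


# Illustrative bounds only; the real ranges come from Section 6.3.4.
limits = Limits(min_inner_len=1280, max_inner_len=65535,
                min_reass_time=1, max_reass_time=120)
print(classify_packet(True, True, 1400, 60, limits))
```

   A packet classified as 'ordinary-data' is decapsulated as usual; only
   a packet classified as 'probe-reply' proceeds to the matching step
   described next.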

   Next, the TE verifies that the inner IP echo reply matches one of its
   cached probe requests by examining the inner IP source and
   destination addresses as well as any other identifying information in
   the inner packet.  The TE sets 'isQualified' to TRUE for this TFE if
   the probe reply is valid; otherwise, it discards the probe reply and
   returns from processing.  If the TE receives excessive invalid probe
   replies from a TFE, it resets 'isQualified' to FALSE and restores
   'MaxOuterPktLen' and 'MaxInnerPktLen' to default values.

8.3.2.2.  Probe Reply Processing

   For IP/*/IPv4 tunneling mechanisms that include port numbers in
   encapsulating * headers, the TE next examines the 'YourPort',
   'YourId' and 'YourAddr' values encoded in the packet.  If the values
   match the 1's complement of the probe request's protocol port, ip_id
   and IPv4 address, respectively, the TE sets 'isNAT' to FALSE;
   otherwise, it sets 'isNAT' to TRUE.  (For encapsulating * headers
   that do not include port numbers, the TE ignores the 'YourPort'
   value in this check.)
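
   The comparison above can be sketched as follows.  This is a minimal
   illustration under stated assumptions: the function and parameter
   names are hypothetical, the field values are assumed to have been
   parsed into integers already, and the sketch reads the paragraph
   above as comparing all three fields, including 'YourAddr', against
   1's complements of the values the TE sent.

```python
import ipaddress


def ones_complement(value: int, bits: int) -> int:
    """Bitwise 1's complement of 'value' within a field of 'bits' bits."""
    return ~value & ((1 << bits) - 1)


def detect_nat(sent_port: int, sent_ip_id: int, sent_addr: str,
               your_port: int, your_id: int, your_addr: int,
               has_ports: bool = True) -> bool:
    """Return the new value of 'isNAT' for this TFE."""
    sent_addr_int = int(ipaddress.IPv4Address(sent_addr))
    id_ok = your_id == ones_complement(sent_ip_id, 16)
    addr_ok = your_addr == ones_complement(sent_addr_int, 32)
    # 'YourPort' is ignored when the * header carries no port numbers.
    port_ok = (not has_ports) or your_port == ones_complement(sent_port, 16)
    return not (id_ok and addr_ok and port_ok)


# Reply that echoes the complemented values unchanged: no NAT detected.
addr = int(ipaddress.IPv4Address("192.0.2.1"))
print(detect_nat(5000, 0x1234, "192.0.2.1",
                 ones_complement(5000, 16),
                 ones_complement(0x1234, 16),
                 ones_complement(addr, 32)))
```

   Any mismatch in a compared field, such as a source port rewritten
   along the path, causes the function to return TRUE.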

   Next, the TE records the 'MaxInnerPktLen' and 'ReassTime' values in
   the corresponding conceptual variables for this TFE.  If the new
   'MaxInnerPktLen' is smaller than ('MaxOuterPktLen' - ENCAPS), the TE
   SHOULD reduce 'MaxOuterPktLen' to ('MaxInnerPktLen' + ENCAPS).  If
   the 'MaxInnerPktLen' and 'ReassTime' values fluctuate significantly
   between successive probe replies, the TE SHOULD record the most
   conservative values received (e.g., 16KB 'MaxInnerPktLen' instead of
   64KB, 90sec 'ReassTime' instead of 60sec, etc.).
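
   The update rules above can be sketched as follows.  This is a
   minimal illustration, not a definitive implementation: the
   'TfeState' type and the function name are hypothetical, and the
   'conservative' flag stands in for the TE's own decision that values
   are fluctuating significantly, which this document does not
   quantify.  Following the example above, "most conservative" is taken
   to mean the smallest 'MaxInnerPktLen' and the largest 'ReassTime'.

```python
from dataclasses import dataclass


@dataclass
class TfeState:
    # Conceptual per-TFE variables named in this document.
    max_outer_pkt_len: int
    max_inner_pkt_len: int
    reass_time: int


def record_probe_reply(state: TfeState, max_inner_pkt_len: int,
                       reass_time: int, encaps: int,
                       conservative: bool = False) -> TfeState:
    """Record advertised values and clamp 'MaxOuterPktLen' if needed."""
    if conservative:
        # Keep the smaller packet length and the longer reassembly time.
        max_inner_pkt_len = min(max_inner_pkt_len, state.max_inner_pkt_len)
        reass_time = max(reass_time, state.reass_time)
    state.max_inner_pkt_len = max_inner_pkt_len
    state.reass_time = reass_time
    # Reduce MaxOuterPktLen to (MaxInnerPktLen + ENCAPS) when it is larger.
    if max_inner_pkt_len < state.max_outer_pkt_len - encaps:
        state.max_outer_pkt_len = max_inner_pkt_len + encaps
    return state


# Illustrative numbers: ENCAPS of 20 bytes, 1400-byte advertisement.
state = TfeState(max_outer_pkt_len=1500, max_inner_pkt_len=1480,
                 reass_time=60)
record_probe_reply(state, max_inner_pkt_len=1400, reass_time=60, encaps=20)
print(state.max_outer_pkt_len)  # 1420
```

   In the usage lines, 1400 is smaller than (1500 - 20), so
   'MaxOuterPktLen' is reduced to (1400 + 20) = 1420.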

   Following the above processing, the TE discards the probe reply.


9.  8-bit Fletcher Checksum Calculation

   The 8-bit Fletcher Checksum is discussed in [RFC1146][STONE1][STONE2]
   and is used by this specification to provide an integrity check with
   different properties than those used by common link layers and upper
   layer protocols.

   The TE calculates the 8-bit Fletcher checksum of the first 64 bytes
   of the inner packet beginning with the inner IP header according to
   the algorithm of [RFC1146], which is reproduced below with an
   additional rule for representing zero results:

        The 8-bit Fletcher Checksum Algorithm is calculated over a
        sequence of data octets (call them D[1] through D[N]) by
        maintaining 2 unsigned 1's-complement 8-bit accumulators A and B
        whose contents are initially zero, and performing the following
        loop where i ranges from 1 to N:

             A := A + D[i]
             B := B + A

        If, at the end of the loop, either or both of the A, B
        accumulators encode the value 0x00, invert the value
        in the accumulator(s) to 0xff.

   Note that faster algorithms are possible and may be used instead of
   the algorithm above; see [RFC1146] for citations of alternate
   algorithms.
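
   The algorithm above can be sketched as follows.  This is a minimal
   illustration under stated assumptions: the function names are
   hypothetical, 1's-complement addition is implemented as 8-bit
   addition with end-around carry, and because the accumulators are 8
   bits wide the zero value is treated as 0x00 and its inverted form as
   0xff.  How A and B are placed in the trailer is left to Section 6.3.

```python
def ones_complement_add8(a: int, b: int) -> int:
    """Add two 8-bit values with end-around carry (1's-complement add)."""
    total = a + b
    # Fold the carry out of bit 7 back into the low-order bit.
    return (total & 0xFF) + (total >> 8)


def fletcher8(data: bytes, limit: int = 64) -> tuple:
    """Return accumulators (A, B) over the first 'limit' octets of data."""
    a = 0
    b = 0
    for octet in data[:limit]:
        a = ones_complement_add8(a, octet)
        b = ones_complement_add8(b, a)
    # Additional rule: a zero result is represented in inverted form.
    if a == 0x00:
        a = 0xFF
    if b == 0x00:
        b = 0xFF
    return a, b


print(fletcher8(b"\x01\x02"))  # (3, 4)
```

   For the two octets 0x01 and 0x02, the loop yields A = 1, B = 1 after
   the first octet and A = 3, B = 4 after the second.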


10.  Updated Specifications

   This document updates the following specifications:

   o  RFC2003 (IP-in-IP)

   o  RFC2529 (6over4)

   o  RFC2661 (L2TP)

   o  RFC2784 (GRE)

   o  RFC3056 (6to4)

   o  RFC3378 (ETHERIP)

   o  RFC3884 (IPsec Transport Mode for Dynamic Routing)

   o  RFC4023 (MPLS-in-IP)

   o  RFC4213 (Basic IPv6 Transition Mechanisms)

   o  RFC4214 (ISATAP)

   o  RFC4301 (IPsec)

   o  RFC4302 (AH)

   o  RFC4303 (ESP)

   o  RFC4380 (TEREDO)

   o  LISP

   o  others....


11.  IANA Considerations

   The IANA is instructed to create a registry for the Version and Type
   values that occur in the footers of encapsulated packets per Section
   6.3.1.


12.  Security Considerations

   A possible attack vector involves an off-path attacker sending probe
   requests and/or probe solicitations with spoofed source addresses.
   Legitimate probe requests and replies contain identifying information
   that is useful for defending against off-path attacks.

   Security considerations for specific IP/*/IPv4 tunneling mechanisms
   are given in the respective documents.


13.  Acknowledgments

   This work has benefited from discussions with Fred Baker, Iljitsch
   van Beijnum, Steve Casner, Gorry Fairhurst, John Heffner, Joe Macker,
   Matt Mathis, and Joe Touch.  Dan Romascanu mentioned the IEEE 802.3as
   extension of the Ethernet frame size to 2048 bytes.  Remi Denis-
   Courmont noted that trailers could be added using the innermost '*'
   protocol length field.


14.  References

14.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              September 1981.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, September 1981.

   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122, October 1989.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1812]  Baker, F., "Requirements for IP Version 4 Routers",
              RFC 1812, June 1995.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

14.2.  Informative References

   [RFC0905]  International Organization for Standardization (ISO), "ISO
              Transport Protocol specification ISO DP 8073", RFC 905,
              April 1984.

   [RFC1146]  Zweig, J. and C. Partridge, "TCP alternate checksum
              options", RFC 1146, March 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2923]  Lahey, K., "TCP Problems with Path MTU Discovery",
              RFC 2923, September 2000.

   [RFC3385]  Sheinwald, D., Satran, J., Thaler, P., and V. Cavanna,
              "Internet Protocol Small Computer System Interface (iSCSI)
              Cyclic Redundancy Check (CRC)/Checksum Considerations",
              RFC 3385, September 2002.

   [RFC3819]  Karn, P., Bormann, C., Fairhurst, G., Grossman, D.,
              Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L.
              Wood, "Advice for Internet Subnetwork Designers", BCP 89,
              RFC 3819, July 2004.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213, October 2005.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
              Network Tunneling", RFC 4459, April 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963, July 2007.

   [STONE1]   Stone, J., "Checksums in the Internet", Stanford Doctoral
              Dissertation, August 2001.

   [STONE2]   Stone, J., Greenwald, M., Partridge, C., and J. Hughes,
              "Performance of Checksums and CRCs over Real Data", IEEE/
              ACM Transactions on Networking, Vol. 6, No. 5,
              October 1998.


Appendix A.  Discussion

   Probing strategies for packetization layer protocols are specified in
   ([RFC4821], Section 7) and apply also to the TE's 'MaxOuterPktLen'
   probing process.

   Further strategies for handling ICMPv4 PTB errors are specified in
   ([RFC4821], Section 7) and apply also to the TE's 'MaxOuterPktLen'
   probing process.

   Note that decapsulation automatically erases any padding that may
   have been inserted by the TE along with the trailing checksum.


Author's Address

   Fred L. Templin (editor)
   Boeing Phantom Works
   P.O. Box 3707
   Seattle, WA  98124
   USA

   Email: fred.l.templin@boeing.com


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).