Intarea Working Group                                          R. Bonica
Internet-Draft                                          Juniper Networks
Intended status: Best Current Practice                      May 29, 2013
Expires: November 30, 2013


       Generic Routing Encapsulation (GRE) Fragmentation Strategy
                    draft-bonica-intarea-gre-mtu-00

Abstract

   This memo documents a GRE fragmentation strategy upon which many
   vendors have converged.  Specifically, it defines procedures to be
   executed by GRE ingress routers.  It is published so that those
   building new implementations will be aware of best common practice.
   It is also published so that those building applications over GRE
   will understand how GRE works.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 30, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents



Bonica                 Expires November 30, 2013                [Page 1]


Internet-Draft             GRE Fragmentation                    May 2013


   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Problem Statement . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  How To Use This Document  . . . . . . . . . . . . . . . .   3
     1.2.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Design Goals  . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Common Procedures . . . . . . . . . . . . . . . . . . . . . .   4
     3.1.  General . . . . . . . . . . . . . . . . . . . . . . . . .   5
     3.2.  Tunnel MTU (TMTU) Discovery . . . . . . . . . . . . . . .   5
   4.  Procedures Affecting The GRE Deliver Header . . . . . . . . .   5
     4.1.  Tunneling GRE Over IPv4 . . . . . . . . . . . . . . . . .   5
     4.2.  Tunneling GRE Over IPv6 . . . . . . . . . . . . . . . . .   6
   5.  Procedures Affecting the GRE Payoad . . . . . . . . . . . . .   6
     5.1.  IPv4 Payloads . . . . . . . . . . . . . . . . . . . . . .   6
     5.2.  IPv6 Payloads . . . . . . . . . . . . . . . . . . . . . .   6
     5.3.  MPLS Payloads . . . . . . . . . . . . . . . . . . . . . .   6
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   7.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
     7.1.  VPN Considerations  . . . . . . . . . . . . . . . . . . .   7
     7.2.  Attacks Against PMTU Discovery  . . . . . . . . . . . . .   7
   8.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   7
   9.  Normative References  . . . . . . . . . . . . . . . . . . . .   7
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   8

1.  Problem Statement

   Generic Routing Encapsulation (GRE) [RFC2784] can be used to carry
   any network layer protocol over any network layer protocol.  GRE has
   been implemented by many vendors and is widely deployed on the
   Internet.

   [RFC2784], by design, does not describe procedures that affect
   fragmentation.  Lacking guidance from the specification, vendors have
   developed implementation-specific fragmentation strategies.  For the
   most part, devices implementing one fragmentation strategy
   interoperate with devices that implement another fragmentation
   strategy.

   However, implementors and network operators have discovered that some
   fragmentation strategies work better than others.  A poorly chosen



Bonica                 Expires November 30, 2013                [Page 2]


Internet-Draft             GRE Fragmentation                    May 2013


   fragmentation strategy can cause operational issues, including black-
   holing, packet reassembly on GRE egress routers and unexpected
   interactions with Path MTU Discovery [RFC1191] [RFC1981].

   This memo documents a GRE fragmentation strategy upon which many
   vendors have converged.  Specifically, it defines procedures to be
   executed by GRE ingress routers.  It is published so that those
   building new implementations will be aware of best common practice.
   It is also published so that those building applications over GRE
   will understand how GRE works.

   This memo specifies requirements beyond those stated in [RFC2784].
   However, it does not update [RFC2784].  Therefore, a GRE
   implementation can be compliant with [RFC2784] without satisfying the
   requirements of this memo.

1.1.  How To Use This Document

   This memo is presented in sections.  Section 2 enumerates design
   goals.  Section 3 defines procedures that all GRE ingress routers
   must execute.

   Section 4 defines procedures affecting generation of the GRE delivery
   header.  It is divided into two subsections.  Section 4.1 is
   applicable when GRE is tunneled over IPv4[RFC0791] and Section 4.2 is
   applicable when GRE is tunneled over IPv6 [RFC2460].

   Section 5 defines procedures for handling payloads that are so large
   that they cannot be forwarded through the GRE tunnel without
   fragmentation.  Section 5.1 is applicable when the payload is IPv4,
   Section 5.2 is applicable when the payload is IPv6 and Section 5.3 is
   applicable with the payload is MPLS.

   Section 6 discusses IANA considerations and Section 7 discusses
   security considerations.

1.2.  Terminology

   The following terms are specific to GRE and are taken from [RFC2784]:

   o  GRE delivery header - an IPv4 or IPv6 header whose source address
      is that of the GRE tunnel ingress and whose destination address is
      that of the GRE tunnel egress.  The GRE delivery header
      encapsulates a GRE header.

   o  GRE header - the GRE protocol header.  The GRE header is
      encapsulated in the GRE delivery header and encapsulates GRE
      payload.



Bonica                 Expires November 30, 2013                [Page 3]


Internet-Draft             GRE Fragmentation                    May 2013


   o  GRE payload - a network layer packet that is encapsulated by the
      GRE header.  The GRE payload can be IPv4, IPv6 or MPLS.
      Procedures for encapsulating IPv4 and IPv6 in GRE are described in
      [RFC2784].  Procedures for encapsulating MPLS in GRE are described
      in [RFC4023].

   o  GRE payload header - the IPv4, IPv6 or MPLS header of the GRE
      payload

   o  GRE overhead - the combined size of the GRE delivery header and
      the GRE header, measured in octets

   The following terms are specific MTU discovery:

   o  link MTU (LMTU) - the maximum transmission unit, i.e., maximum
      packet size in octets, that can be conveyed over a link without
      fragmentation

   o  path MTU (PMTU) - the minimum LMTU of all the links in a path
      between a source node and a destination node

   o  tunnel MTU (TMTU) - the maximum transmission unit, i.e., maximum
      packet size in octets, that can be conveyed over a GRE tunnel
      without fragmentation.  The TMTU is equal to the PMTU associated
      with the path between the tunnel ingress and the tunnel egress,
      minus the GRE overhead

2.  Design Goals

   The following is an ordered list of design goals for this
   specification:

   1.  Avoid black-holing

   2.  Avoid fragmentation

   3.  If fragmentation cannot be avoided, avoid fragmentation
       procedures that require reassemby on the GRE egress router.

   As an alternative to fragmentation, the procedures described herein
   rely on PMTU Discovery at the payload source.  Therefore, the
   procedures described herein cause the GRE ingress router to provide
   the payload source with all ICMP feedback required for PMTU
   Discovery.

3.  Common Procedures





Bonica                 Expires November 30, 2013                [Page 4]


Internet-Draft             GRE Fragmentation                    May 2013


   This section defines procedures that all GRE ingress routers must
   execute.

3.1.  General

   Implementations MUST satisfy all of the requirements stated in
   [RFC2784].

3.2.  Tunnel MTU (TMTU) Discovery

   Implementations MUST maintain a local data structure that reflects
   the TMTU of each GRE tunnel that originates on the node.  The TMTU
   MUST be equal to the PMTU associated with the path between the tunnel
   ingress and the tunnel egress, minus the GRE overhead.

   By default, implementations MUST discover the PMTU associated with
   the path between the tunnel ingress and the tunnel egress.  PMTU
   discovery procedures defined in [RFC1191] and [RFC1981] and will
   never permit the PMTU to exceed the LMTU associated with the first IP
   hop in the path to the tunnel egress.

   However, implementations MUST include a configuration option that
   disables PMTU Discovery for GRE tunnels.  This configuration option
   may be required to mitigate certain denial of service attacks (see
   Section 7).  When PMTU discovery for GRE tunnels is disabled, the
   TMTU for a tunnel MUST default to the LMTU associated with the first
   IP hop in the path to the tunnel egress, minus the GRE overhead.
   However, implementations MAY include a configuration option through
   which the TMTU can be set to another value, which is likely to be
   lower.

4.  Procedures Affecting The GRE Deliver Header

   This section defines procedures that GRE ingress routers execute
   while generating the GRE delivery header.

4.1.  Tunneling GRE Over IPv4

   When the GRE ingress router tunnels an IPv4 payload over IPv4, and
   the DF Bit in the payload header is set to 1 (Don't Fragment), the
   GRE ingress router MUST set the DF bit in the delivery header to 1.

   When the GRE ingress router tunnels an IPv4 payload over IPv4, and
   the DF Bit in the payload header is set to 0 (May Fragment), by
   default, the GRE ingress router MUST set the DF bit in the delivery
   header to 1.  However, implementations MAY include a configuration
   option that allows the DF bit to be copied from the payload header to
   the delivery header.



Bonica                 Expires November 30, 2013                [Page 5]


Internet-Draft             GRE Fragmentation                    May 2013


   When the GRE ingress router tunnels an IPv6 payload over IPv4, the
   GRE ingress router MUST set the DF bit in the delivery header to 1.

   The GRE ingress router MUST NOT emit a delivery header in which the
   MF bit is set to 1 (More Fragments).

4.2.  Tunneling GRE Over IPv6

   The GRE ingress router MUST NOT emit a delivery header containing a
   fragment header.

5.  Procedures Affecting the GRE Payoad

   This section defines procedures that GRE ingress routers execute when
   they receive a packet a) whose next-hop is a GRE tunnel and b) whose
   size is greater than the TMTU associated with that tunnel.

5.1.  IPv4 Payloads

   If the DF bit in the payload header is set to 1 (Don't Fragment), the
   GRE ingress router MUST discard the packet and sent an ICMPv4
   [RFC0792] Destination Unreachable message to the payload source, with
   type equal to 4 (fragmentation needed and DF set).  The ICMP
   Destination Unreachable message MUST contain an Next-hop MTU (as
   specified by [RFC1191]) and the next-hop MTU MUST be equal to the
   TMTU associated with the tunnel.

   If the DF bit in the payload header is set to 0 (May Fragment), the
   GRE ingress router MUST fragment the payload and submit each fragment
   to GRE tunnel.  Therefore, the GRE egress router will receive
   complete, non-fragmented packets, containing fragmented payloads.
   The GRE egress router will forward the payload fragments to their
   ultimate destination where they will be reassembled.

5.2.  IPv6 Payloads

   The GRE ingress router MUST discard the packet and send an ICMPv6
   [RFC4443] Packet Too Big message to the payload source.  The MTU
   specified in the Packet Too Big message MUST be equal to the TMTU
   associated with the tunnel.

5.3.  MPLS Payloads

   The GRE ingress router MUST discard the packet.  As it is impossible
   to reliably identify the payload source, the GRE ingress router MUST
   NOT attempt to send an ICMPv4 Destination Unreachable message or an
   ICMPv6 Packet Too Big message to the payload source.




Bonica                 Expires November 30, 2013                [Page 6]


Internet-Draft             GRE Fragmentation                    May 2013


6.  IANA Considerations

   This document makes no request of IANA.

7.  Security Considerations

7.1.  VPN Considerations

   [RFC4364] introduces the concept of a Virtual Routing and Forwarding
   Table (VRF).  When a GRE ingress router forwards an ICMP message to
   the payload source, it MUST forward that message using the
   appropriate VRF.  Failure to do so would a) cause information to leak
   between VRFs and b) prevent the ICMP message from reaching its
   intended destination.

   Specifically, the GRE ingress router MUST forward the ICMP message
   using the VRF that is associated with the interface upon which the
   payload arrived.

7.2.  Attacks Against PMTU Discovery

   PMTU Discovery is vulnerable to two denial of service attacks (see
   Section 8 of [RFC1191] for details).  Both attacks are based upon on
   a malicious party sending forged ICMPv4 Destination Unreachable or
   ICMPv6 Packet Too Big messages to a host.  In the first attack, the
   forged message indicates an inordinately small PMTU.  In the second
   attack, the forged message indicates an inordinately large MTU.  In
   both cases, throughput is adversely affected.  On order to mitigate
   such attacks, GRE implementations MUST include a configuration option
   to disable PMTU discovery on GRE tunnels.

8.  Acknowledgements

   The authors would like to thank John Scudder, Jeff Haas and Jagadish
   Grandhi for their constructive comments.

9.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, September
              1981.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, September 1981.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.





Bonica                 Expires November 30, 2013                [Page 7]


Internet-Draft             GRE Fragmentation                    May 2013


   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S.E. and R.M. Hinden, "Internet Protocol, Version
              6 (IPv6) Specification", RFC 2460, December 1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
              4023, March 2005.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

Author's Address

   Ron Bonica
   Juniper Networks
   2251 Corporate Park Drive Herndon
   Herndon, Virginia  20170
   USA

   Email: rbonica@juniper.net

















Bonica                 Expires November 30, 2013                [Page 8]