6lo                                                      P. Thubert, Ed.
Internet-Draft                                             Cisco Systems
Updates: 4944 (if approved)                              23 October 2019
Intended status: Standards Track
Expires: 25 April 2020

                  6LoWPAN Selective Fragment Recovery


   This draft updates RFC 4944 with a simple protocol to recover
   individual fragments across a route-over mesh network, with a minimal
   flow control to protect the network against bloat.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 25 April 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Thubert                   Expires 25 April 2020                 [Page 1]

Internet-Draft               Selective RFRAG                October 2019

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.1.  BCP 14  . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.2.  References  . . . . . . . . . . . . . . . . . . . . . . .   4
     2.3.  6LoWPAN Acronyms  . . . . . . . . . . . . . . . . . . . .   4
     2.4.  Referenced Work . . . . . . . . . . . . . . . . . . . . .   4
     2.5.  New Terms . . . . . . . . . . . . . . . . . . . . . . . .   5
   3.  Updating RFC 4944 . . . . . . . . . . . . . . . . . . . . . .   6
   4.  Extending draft-ietf-6lo-minimal-fragment . . . . . . . . . .   6
     4.1.  Slack in the First Fragment . . . . . . . . . . . . . . .   7
     4.2.  Gap between frames  . . . . . . . . . . . . . . . . . . .   7
     4.3.  Modifying the First Fragment  . . . . . . . . . . . . . .   7
   5.  New Dispatch types and headers  . . . . . . . . . . . . . . .   8
     5.1.  Recoverable Fragment Dispatch type and Header . . . . . .   8
     5.2.  RFRAG Acknowledgment Dispatch type and Header . . . . . .  11
   6.  Fragments Recovery  . . . . . . . . . . . . . . . . . . . . .  12
     6.1.  Forwarding Fragments  . . . . . . . . . . . . . . . . . .  14
       6.1.1.  Upon the first fragment . . . . . . . . . . . . . . .  14
       6.1.2.  Upon the next fragments . . . . . . . . . . . . . . .  15
     6.2.  Upon the RFRAG Acknowledgments  . . . . . . . . . . . . .  15
     6.3.  Aborting the Transmission of a Fragmented Packet  . . . .  16
   7.  Management Considerations . . . . . . . . . . . . . . . . . .  17
     7.1.  Protocol Parameters . . . . . . . . . . . . . . . . . . .  17
     7.2.  Observing the network . . . . . . . . . . . . . . . . . .  18
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  18
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  19
   10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  19
   11. Normative References  . . . . . . . . . . . . . . . . . . . .  19
   12. Informative References  . . . . . . . . . . . . . . . . . . .  20
   Appendix A.  Rationale  . . . . . . . . . . . . . . . . . . . . .  23
   Appendix B.  Requirements . . . . . . . . . . . . . . . . . . . .  24
   Appendix C.  Considerations On Flow Control . . . . . . . . . . .  25
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  26

1.  Introduction

   In most Low Power and Lossy Network (LLN) applications, the bulk of
   the traffic consists of small chunks of data (in the order few bytes
   to a few tens of bytes) at a time.  Given that an IEEE Std. 802.15.4
   [IEEE.802.15.4] frame can carry a payload of 74 bytes or more,
   fragmentation is usually not required.  However, and though this
   happens only occasionally, a number of mission critical applications
   do require the capability to transfer larger chunks of data, for
   instance to support the firmware upgrade of the LLN nodes or the
   extraction of logs from LLN nodes.  In the former case, the large
   chunk of data is transferred to the LLN node, whereas in the latter,

Thubert                   Expires 25 April 2020                 [Page 2]

Internet-Draft               Selective RFRAG                October 2019

   the large chunk flows away from the LLN node.  In both cases, the
   size can be on the order of 10 kilobytes or more and an end-to-end
   reliable transport is required.

   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   defines the original 6LoWPAN datagram fragmentation mechanism for
   LLNs.  One critical issue with this original design is that routing
   an IPv6 [RFC8200] packet across a route-over mesh requires to
   reassemble the full packet at each hop, which may cause latency along
   a path and an overall buffer bloat in the network.  The "6TiSCH
   Architecture" [I-D.ietf-6tisch-architecture] recommends to use a hop-
   by-hop fragment forwarding technique to alleviate those undesirable
   effects.  "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] proposes such a technique, in a
   fashion that is compatible with [RFC4944] without the need to define
   a new protocol.

   However, adding that capability alone to the local implementation of
   the original 6LoWPAN fragmentation would not address the inherent
   fragility of fragmentation (see [I-D.ietf-intarea-frag-fragile]) in
   particular the issues of resources locked on the receiver and the
   wasted transmissions due to the loss of a single fragment ina whole
   datagram.  [Kent] compares the unreliable delivery of fragments with
   a mechanism it calls "selective acknowledgements" that recovers the
   loss of a fragment individually.  The paper illustrates the benefits
   that can be derived from such a method in figures 1, 2 and 3, pages 6
   and 7.  [RFC4944] as no selective recovery and the whole datagram
   fails when one fragment is not delivered to the destination 6LoWPAN
   endpoint.  Constrained memory resources are blocked on the receiver
   until the receiver times out, possibly causing the loss of subsequent
   packets that can not be received for the lack of buffers.

   That problem is exacerbated when forwarding fragments over multiple
   hops since a loss at an intermediate hop will not be discovered by
   either the source or the destination, and the source will keep on
   sending fragments, wasting even more resources in the network and
   possibly contributing to the condition that caused the loss to no
   avail since the datagram cannot arrive in its entirety.  RFC 4944 is
   also missing signaling to abort a multi-fragment transmission at any
   time and from either end, and, if the capability to forward fragments
   is implemented, clean up the related state in the network.  It is
   also lacking flow control capabilities to avoid participating to a
   congestion that may in turn cause the loss of a fragment and
   potentially the retransmission of the full datagram.

   This specification provides a method to forward fragments across a
   multi-hop route-over mesh, and a selective acknowledgment to recover
   individual fragments between 6LoWPAN endpoints.

Thubert                   Expires 25 April 2020                 [Page 3]

Internet-Draft               Selective RFRAG                October 2019

   The method is designed to limit congestion loss in the network and
   addresses the requirements that are detailed in Appendix B.

2.  Terminology

2.1.  BCP 14

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.2.  References

   In this document, readers will encounter terms and concepts that are
   discussed in "Problem Statement and Requirements for IPv6 over
   Low-Power Wireless Personal Area Network (6LoWPAN) Routing" [RFC6606]

2.3.  6LoWPAN Acronyms

   This document uses the following acronyms:

   6BBR:  6LoWPAN Backbone Router
   6LBR:  6LoWPAN Border Router
   6LN:  6LoWPAN Node
   6LR:  6LoWPAN Router
   LLN:  Low-Power and Lossy Network

2.4.  Referenced Work

   Past experience with fragmentation has shown that misassociated or
   lost fragments can lead to poor network behavior and, occasionally,
   trouble at application layer.  The reader is encouraged to read "IPv4
   Reassembly Errors at High Data Rates" [RFC4963] and follow the
   references for more information.

   That experience led to the definition of "Path MTU discovery"
   [RFC8201] (PMTUD) protocol that limits fragmentation over the

   Specifically in the case of UDP, valuable additional information can
   be found in "UDP Usage Guidelines for Application Designers"

   Readers are expected to be familiar with all the terms and concepts
   that are discussed in "IPv6 over Low-Power Wireless Personal Area
   Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and

Thubert                   Expires 25 April 2020                 [Page 4]

Internet-Draft               Selective RFRAG                October 2019

   Goals" [RFC4919] and "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944].

   "The Benefits of Using Explicit Congestion Notification (ECN)"
   [RFC8087] provides useful information on the potential benefits and
   pitfalls of using ECN.

   Quoting the "Multiprotocol Label Switching (MPLS) Architecture"
   [RFC3031]: with MPLS, 'packets are "labeled" before they are
   forwarded'.  At subsequent hops, there is no further analysis of the
   packet's network layer header.  Rather, the label is used as an index
   into a table which specifies the next hop, and a new label".  The
   MPLS technique is leveraged in the present specification to forward
   fragments that actually do not have a network layer header, since the
   fragmentation occurs below IP.

   "LLN Minimal Fragment Forwarding" [I-D.ietf-6lo-minimal-fragment]
   introduces the concept of a Virtual Reassembly Buffer (VRB) and an
   associated technique to forward fragments as they come, using the
   datagram_tag as a label in a fashion similar to MPLS.  This
   specification reuses that technique with slightly modified controls.

2.5.  New Terms

   This specification uses the following terms:

   6LoWPAN endpoints:  The LLN nodes in charge of generating or
      expanding a 6LoWPAN header from/to a full IPv6 packet.  The
      6LoWPAN endpoints are the points where fragmentation and
      reassembly take place.

   Compressed Form:  This specification uses the generic term Compressed
      Form to refer to the format of a datagram after the action of
      [RFC6282] and possibly [RFC8138] for RPL [RFC6550] artifacts.

   datagram_size:  The size of the datagram in its Compressed Form
      before it is fragmented.  The datagram_size is expressed in a unit
      that depends on the MAC layer technology, by default a byte.

   datagram_tag:  An identifier of a datagram that is locally unique to
      the Layer-2 sender.  Associated with the MAC address of the
      sender, this becomes a globally unique identifier for the

   fragment_offset:  The offset of a particular fragment of a datagram
      in its Compressed Form.  The fragment_offset is expressed in a
      unit that depends on the MAC layer technology and is by default a

Thubert                   Expires 25 April 2020                 [Page 5]

Internet-Draft               Selective RFRAG                October 2019

   RFRAG:  Recoverable Fragment

   RFRAG-ACK:  Recoverable Fragment Acknowledgement

   RFRAG Acknowledgment Request:  An RFRAG with the Acknowledgement
      Request flag ('X' flag) set.

   NULL bitmap:  Refers to a bitmap with all bits set to zero.

   FULL bitmap:  Refers to a bitmap with all bits set to one.

   Forward:  The direction of a LSP path, followed by the RFRAG.

   Reverse:  The reverse direction of a LSP path, taken by the RFRAG-

3.  Updating RFC 4944

   This specification updates the fragmentation mechanism that is
   specified in "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944] for use in route-over LLNs by providing a model
   where fragments can be forwarded end-to-end across a 6LoWPAN LLN, and
   where fragments that are lost on the way can be recovered
   individually.  A new format for fragment is introduced and new
   dispatch types are defined in Section 5.

   [RFC8138] allows to modify the size of a packet en-route by removing
   the consumed hops in a compressed Routing Header.  It results that
   fragment_offset and datagram_size (see Section 2.5) must also be
   modified en-route, whcih is difficult to do in the uncompressed form.
   This specification expresses those fields in the Compressed Form and
   allows to modify them en-route (see Section 4.3) easily.

   Note that consistently with Section 2 of [RFC6282] for the
   fragmentation mechanism described in Section 5.3 of [RFC4944], any
   header that cannot fit within the first fragment MUST NOT be
   compressed when using the fragmentation mechanism described in this

4.  Extending draft-ietf-6lo-minimal-fragment

   This specification extends the fragment forwarding mechanism
   specified in "LLN Minimal Fragment Forwarding"
   [I-D.ietf-6lo-minimal-fragment] by providing additional operations to
   improve the management of the Virtual Reassembly Buffer (VRB) in the
   context of recoverable fragments.

Thubert                   Expires 25 April 2020                 [Page 6]

Internet-Draft               Selective RFRAG                October 2019

4.1.  Slack in the First Fragment

   At the time of this writing, [I-D.ietf-6lo-minimal-fragment] allows
   for refragmenting in intermediate nodes, meaning that some bytes from
   a given fragment may be left in the VRB to be added to the next
   fragment.  The reason for this to happen would be the need for space
   in the outgoing fragment that was not needed in the incoming
   fragment, for instance because the 6LoWPAN Header Compression is not
   as efficient on the outgoing link, e.g., if the Interface ID (IID) of
   the source IPv6 address is elided by the originator on the first hop
   because it matches the source MAC address, but cannot be on the next
   hops because the source MAC address changes.

   This specification cannot allow this operation since fragments are
   recovered end-to-end based on a sequence number.  This means that the
   fragments that contain a 6LoWPAN-compressed header MUST have enough
   slack to enable a less efficient compression in the next hops that
   still fits in one MAC frame.  For instance, if the IID of the source
   IPv6 address is elided by the originator, then it MUST compute the
   fragment_size as if the MTU was 8 bytes less.  This way, the next hop
   can restore the source IID to the first fragment without impacting
   the second fragment.

4.2.  Gap between frames

   This specification introduces a concept of Inter-Frame Gap, which is
   a configurable interval of time between transmissions to a same next
   hop.  In the case of half duplex interfaces, this InterFrameGap
   ensures that the next hop has progressed the previous frame and is
   capable of receiving the next one.

   In the case of a mesh operating at a single frequency with
   omnidirectional antennas, a larger InterFrameGap is required to
   protect the frame against hidden terminal collisions with the
   previous frame of a same flow that is still progressing along a
   common path.

   The Inter-Frame Gap is useful even for unfragmented datagrams, but it
   becomes a necessity for fragments that are typically generated in a
   fast sequence and are all sent over the exact same path.

4.3.  Modifying the First Fragment

   The compression of the Hop Limit, of the source and destination
   addresses in the IPv6 Header, and of the Routing Header, may change
   en-route in a Route-Over mesh LLN.  If the size of the first fragment
   is modified, then the intermediate node MUST adapt the datagram_size
   to reflect that difference.

Thubert                   Expires 25 April 2020                 [Page 7]

Internet-Draft               Selective RFRAG                October 2019

   The intermediate node MUST also save the difference of datagram_size
   of the first fragment in the VRB and add it to the datagram_size and
   to the fragment_offset of all the subsequent fragments for that

5.  New Dispatch types and headers

   This specification enables the 6LoWPAN fragmentation sublayer to
   provide an MTU up to 2048 bytes to the upper layer, which can be the
   6LoWPAN Header Compression sublayer that is defined in the
   "Compression Format for IPv6 Datagrams" [RFC6282] specification.  In
   order to achieve this, this specification enables the fragmentation
   and the reliable transmission of fragments over a multihop 6LoWPAN
   mesh network.

   This specification provides a technique that is derived from MPLS to
   forward individual fragments across a 6LoWPAN route-over mesh without
   reassembly at each hop.  The datagram_tag is used as a label; it is
   locally unique to the node that owns the source MAC address of the
   fragment, so together the MAC address and the label can identify the
   fragment globally.  A node may build the datagram_tag in its own
   locally-significant way, as long as the chosen datagram_tag stays
   unique to the particular datagram for the lifetime of that datagram.
   It results that the label does not need to be globally unique but
   also that it must be swapped at each hop as the source MAC address

   This specification extends RFC 4944 [RFC4944] with 2 new Dispatch
   types, for Recoverable Fragment (RFRAG) and for the RFRAG
   Acknowledgment back.  The new 6LoWPAN Dispatch types are taken from
   Page 0 [RFC8025] as indicated in Table 1 in Section 9.

   In the following sections, a "datagram_tag" extends the semantics
   defined in [RFC4944] Section 5.3."Fragmentation Type and Header".
   The datagram_tag is a locally unique identifier for the datagram from
   the perspective of the sender.  This means that the datagram_tag
   identifies a datagram uniquely in the network when associated with
   the source of the datagram.  As the datagram gets forwarded, the
   source changes and the datagram_tag must be swapped as detailed in

5.1.  Recoverable Fragment Dispatch type and Header

   In this specification, if the packet is compressed then the size and
   offset of the fragments are expressed on the Compressed Form of the
   packet form as opposed to the uncompressed - native - packet form.

Thubert                   Expires 25 April 2020                 [Page 8]

Internet-Draft               Selective RFRAG                October 2019

   The format of the fragment header is shown in Figure 1.  It is the
   same for all fragments.  The format has a length and an offset, as
   well as a sequence field.  This would be redundant if the offset was
   computed as the product of the sequence by the length, but this is
   not the case.  The position of a fragment in the reassembly buffer is
   neither correlated with the value of the sequence field nor with the
   order in which the fragments are received.  This enables out-of-
   sequence subfragmenting, e.g., a fragment seq. 5 that is retried end-
   to-end as smaller fragments seq. 5, 13 and 14 due to a change of MTU
   along the path between the 6LoWPAN endpoints.

                             1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                       |1 1 1 0 1 0 0|E|  datagram_tag |
       |X| sequence|   fragment_size   |       fragment_offset         |

                                                X set == Ack-Request

                  Figure 1: RFRAG Dispatch type and Header

   There is no requirement on the receiver to check for contiguity of
   the received fragments, and the sender MUST ensure that when all
   fragments are acknowledged, then the datagram is fully received.
   This may be useful in particular in the case where the MTU changes
   and a fragment sequence is retried with a smaller fragment_size, the
   remainder of the original fragment being retried with new sequence

   The first fragment is recognized by a sequence of 0; it carries its
   fragment_size and the datagram_size of the compressed packet before
   it is fragmented, whereas the other fragments carry their
   fragment_size and fragment_offset.  The last fragment for a datagram
   is recognized when its fragment_offset and its fragment_size add up
   to the datagram_size.

   Recoverable Fragments are sequenced and a bitmap is used in the RFRAG
   Acknowledgment to indicate the received fragments by setting the
   individual bits that correspond to their sequence.

   X:  1 bit; Ack-Request: when set, the sender requires an RFRAG
      Acknowledgment from the receiver.

   E:  1 bit; Explicit Congestion Notification; the "E" flag is reset by
      the source of the fragment and set by intermediate routers to
      signal that this fragment experienced congestion along its path.

Thubert                   Expires 25 April 2020                 [Page 9]

Internet-Draft               Selective RFRAG                October 2019

   Fragment_size:  10 bit unsigned integer; the size of this fragment in
      a unit that depends on the MAC layer technology.  Unless
      overridden by a more specific specification, that unit is the
      octet which allows fragments up to 512 bytes.

   datagram_tag:  16 bits; an identifier of the datagram that is locally
      unique to the sender.

   Sequence:  5 bit unsigned integer; the sequence number of the
      fragment in the acknowledgement bitmap.  Fragments are numbered
      [0..N] where N is in [0..31].  A Sequence of 0 indicates the first
      fragment in a datagram, but non-zero values are not indicative of
      the position in the reassembly buffer.

   Fragment_offset:  16 bit unsigned integer.

      When the Fragment_offset is set to a non-0 value, its semantics
      depend on the value of the Sequence field as follows:

      *  For a first fragment (i.e. with a Sequence of 0), this field
         indicates the datagram_size of the compressed datagram, to help
         the receiver allocate an adapted buffer for the reception and
         reassembly operations.  The fragment may be stored for local
         reassembly.  Alternatively, it may be routed based on the
         destination IPv6 address.  In that case, a VRB state must be
         installed as described in Section 6.1.1.
      *  When the Sequence is not 0, this field indicates the offset of
         the fragment in the Compressed Form of the datagram.  The
         fragment may be added to a local reassembly buffer or forwarded
         based on an existing VRB as described in Section 6.1.2.

      A Fragment_offset that is set to a value of 0 indicates an abort
      condition and all state regarding the datagram should be cleaned
      up once the processing of the fragment is complete; the processing
      of the fragment depends on whether there is a VRB already
      established for this datagram, and the next hop is still

      *  if a VRB already exists and is not broken, the fragment is to
         be forwarded along the associated Label Switched Path (LSP) as
         described in Section 6.1.2, but regardless of the value of the
         Sequence field;
      *  else, if the Sequence is 0, then the fragment is to be routed
         as described in Section 6.1.1 but no state is conserved
         afterwards.  In that case, the session if it exists is aborted
         and the packet is also forwarded in an attempt to clean up the
         next hops as along the path indicated by the IPv6 header
         (possibly including a routing header).

Thubert                   Expires 25 April 2020                [Page 10]

Internet-Draft               Selective RFRAG                October 2019

      If the fragment cannot be forwarded or routed, then an abort
      RFRAG-ACK is sent back to the source as described in
      Section 6.1.2.

5.2.  RFRAG Acknowledgment Dispatch type and Header

   This specification also defines a 4-octet RFRAG Acknowledgment bitmap
   that is used by the reassembling endpoint to confirm selectively the
   reception of individual fragments.  A given offset in the bitmap maps
   one to one with a given sequence number and indicates which fragment
   is acknowledged as follows:

                            1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       |           RFRAG Acknowledgment Bitmap                         |
        ^                 ^
        |                 |    bitmap indicating whether:
        |                 +----- Fragment with sequence 9 was received
        +----------------------- Fragment with sequence 0 was received

               Figure 2: RFRAG Acknowledgment bitmap encoding

   Figure 3 shows an example Acknowledgment bitmap which indicates that
   all fragments from sequence 0 to 20 were received, except for
   fragments 1, 2 and 16 that were lost and must be retried.

                            1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

               Figure 3: Example RFRAG Acknowledgment Bitmap

   The RFRAG Acknowledgment Bitmap is included in a RFRAG Acknowledgment
   header, as follows:

                            1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                       |1 1 1 0 1 0 1|E|  datagram_tag |
       |          RFRAG Acknowledgment Bitmap (32 bits)                |

Thubert                   Expires 25 April 2020                [Page 11]

Internet-Draft               Selective RFRAG                October 2019

          Figure 4: RFRAG Acknowledgment Dispatch type and Header

   E:  1 bit; Explicit Congestion Notification Echo

      When set, the sender indicates that at least one of the
      acknowledged fragments was received with an Explicit Congestion
      Notification, indicating that the path followed by the fragments
      is subject to congestion.  More in Appendix C.

   RFRAG Acknowledgment Bitmap:  An RFRAG Acknowledgment Bitmap, whereby
      setting the bit at offset x indicates that fragment x was
      received, as shown in Figure 2.  A NULL bitmap that indicates that
      the fragmentation process is aborted.  A FULL bitmap that
      indicates that the fragmentation process is complete, all
      fragments were received at the reassembly endpoint.

6.  Fragments Recovery

   The Recoverable Fragment header RFRAG is used to transport a fragment
   and optionally request an RFRAG Acknowledgment that will confirm the
   good reception of one or more fragments.  An RFRAG Acknowledgment is
   carried as a standalone fragment header (i.e.  with no 6LoWPAN
   payload) in a message that is propagated back to the 6LoWPAN endpoint
   that was the originator of the fragments.  To achieve this, each hop
   that performed an MPLS-like operation on fragments reverses that
   operation for the RFRAG_ACK by sending a frame from the next hop to
   the previous hop as known by its MAC address in the VRB.  The
   datagram_tag in the RFRAG_ACK is unique to the receiver and is enough
   information for an intermediate hop to locate the VRB that contains
   the datagram_tag used by the previous hop and the Layer-2 information
   associated to it (interface and MAC address).

   The 6LoWPAN endpoint that fragments the packets at 6LoWPAN level (the
   sender) also controls the amount of acknowledgments by setting the
   Ack-Request flag in the RFRAG packets.  The sender may set the Ack-
   Request flag on any fragment to perform congestion control by
   limiting the number of outstanding fragments, which are the fragments
   that have been sent but for which reception or loss was not
   positively confirmed by the reassembling endpoint.  The maximum
   number of outstanding fragments is the Window-Size.  It is
   configurable and may vary in case of ECN notification.  When the
   6LoWPAN endpoint that reassembles the packets at 6LoWPAN level (the
   receiver) receives a fragment with the Ack-Request flag set, it MUST
   send an RFRAG Acknowledgment back to the originator to confirm
   reception of all the fragments it has received so far.

   The Ack-Request ('X') set in an RFRAG marks the end of a window.
   This flag MUST be set on the last fragment if the sender wishes to

Thubert                   Expires 25 April 2020                [Page 12]

Internet-Draft               Selective RFRAG                October 2019

   protect the datagram, and it MAY be set in any intermediate fragment
   for the purpose of flow control.  This ARQ process MUST be protected
   by a timer, and the fragment that carries the 'X' flag MAY be retried
   upon time out a configurable amount of times (see Section 7.1).  Upon
   exhaustion of the retries the sender may either abort the
   transmission of the datagram or retry the datagram from the first
   fragment with an 'X' flag set in order to reestablish a path and
   discover which fragments were received over the old path in the
   acknowledgment bitmap.  When the sender of the fragment knows that an
   underlying link-layer mechanism protects the fragments, it may
   refrain from using the RFRAG Acknowledgment mechanism, and never set
   the Ack-Request bit.

   The RFRAG Acknowledgment can optionally carry an ECN indication for
   flow control (see Appendix C).  The receiver of a fragment with the
   'E' (ECN) flag set MUST echo that information by setting the 'E'
   (ECN) flag in the next RFRAG Acknowledgment.

   In order to protect the datagram, the sender transfers a controlled
   number of fragments and flags the last fragment of a window with an
   RFRAG Acknowledgment Request.  The receiver MUST acknowledge a
   fragment with the acknowledgment request bit set.  If any fragment
   immediately preceeding an acknowledgment request is still missing,
   the receiver MAY intentionally delay its acknowledgment to allow in-
   transit fragments to arrive.  Because it might defeat the round trip
   delay computation, delaying the acknowledgment should be configurable
   and not enabled by default.

   The receiver MAY issue unsolicited acknowledgments.  An unsolicited
   acknowledgment signals to the sender endpoint that it can resume
   sending if it had reached its maximum number of outstanding
   fragments.  Another use is to inform that the reassembling endpoint
   aborted the process of an individual datagram.

   When all the fragments are received, the receiving endpoint
   reconstructs the packet, passes it to the upper layer, sends a RFRAG
   Acknowledgment on the reverse path with a FULL bitmap, and harms a
   short timer to absorb packets that are still in flight for that
   datagram without creating a new state and abort the communication if
   it keeps going on.

   Note that acknowledgments might consume precious resources so the use
   of unsolicited acknowledgments should be configurable and not enabled
   by default.

   An observation is that streamlining forwarding of fragments generally
   reduces the latency over the LLN mesh, providing room for retries
   within existing upper-layer reliability mechanisms.  The sender

Thubert                   Expires 25 April 2020                [Page 13]

Internet-Draft               Selective RFRAG                October 2019

   protects the transmission over the LLN mesh with a retry timer that
   is computed according to the method detailed in [RFC6298].  It is
   expected that the upper layer retries obey the recommendations in
   "UDP Usage Guidelines" [RFC8085], in which case a single round of
   fragment recovery should fit within the upper layer recovery timers.

   Fragments are sent in a round robin fashion: the sender sends all the
   fragments for a first time before it retries any lost fragment; lost
   fragments are retried in sequence, oldest first.  This mechanism
   enables the receiver to acknowledge fragments that were delayed in
   the network before they are retried.

   When a single frequency is used by contiguous hops, the sender should
   wait a reasonable amount of time between fragments so as to let a
   fragment progress a few hops and avoid hidden terminal issues.  This
   precaution is not required on channel hopping technologies such as
   Time Slotted Channel Hopping (TSCH) [RFC6554], where nodes that
   communicate at Layer-2 are scheduled to send and receive
   respectively, and different hops operate on different channels.

6.1.  Forwarding Fragments

   It is assumed that the first Fragment is large enough to carry the
   IPv6 header and make routing decisions.  If that is not so, then this
   specification MUST NOT be used.

   This specification extends the Virtual Reassembly Buffer (VRB)
   technique to forward fragments with no intermediate reconstruction of
   the entire packet.  It inherits operations like datagram_tag
   Switching and using a timer to clean the VRB when the traffic dries
   up.  In more details, the first fragment carries the IP header and it
   is routed all the way from the fragmenting endpoint to the
   reassembling endpoint.  Upon the first fragment, the routers along
   the path install a label-switched path (LSP), and the following
   fragments are label-switched along that path.  As a consequence, the
   next fragments can only follow the path that was set up by the first
   fragment and cannot follow an alternate route.  The datagram_tag is
   used to carry the label, that is swapped at each hop.  All fragments
   follow the same path and fragments are delivered in the order at
   which they are sent.

6.1.1.  Upon the first fragment

   In Route-Over mode, the source and destination MAC addressed in a
   frame change at each hop.  The label that is formed and placed in the
   datagram_tag is associated to the source MAC and only valid (and
   unique) for that source MAC.  Upon a first fragment (i.e. with a
   sequence of zero), an intermediate router creates a VRB and the

Thubert                   Expires 25 April 2020                [Page 14]

Internet-Draft               Selective RFRAG                October 2019

   associated LSP state for the tuple (source MAC address, datagram_tag)
   and the fragment is forwarded along the IPv6 route that matches the
   destination IPv6 address in the IPv6 header as prescribed by
   [I-D.ietf-6lo-minimal-fragment], whereas the receiving endpoint
   allocates a reassembly buffer.  The LSP state enables to match the
   (previous MAC address, datagram_tag) in an incoming fragment to the
   tuple (next MAC address, swapped datagram_tag) used in the forwarded
   fragment and points at the VRB.  In addition, the router also forms a
   Reverse LSP state indexed by the MAC address of the next hop and the
   swapped datagram_tag.  This reverse LSP state also points at the VRB
   and enables to match the (next MAC address, swapped_datagram_tag)
   found in an RFRAG Acknowledgment to the tuple (previous MAC address,
   datagram_tag) used when forwarding a Fragment Acknowledgment (RFRAG-
   ACK) back to the sender endpoint.

6.1.2.  Upon the next fragments

   Upon a next fragment (i.e. with a non-zero sequence), an intermediate
   router looks up a LSP indexed by the tuple (MAC address,
   datagram_tag) found in the fragment.  If it is found, the router
   forwards the fragment using the associated VRB as prescribed by

   if the VRB for the tuple is not found, the router builds an RFRAG-ACK
   to abort the transmission of the packet.  The resulting message has
   the following information:

   *  The source and destination MAC addresses are swapped from those
      found in the fragment
   *  The datagram_tag set to the datagram_tag found in the fragment
   *  A NULL bitmap is used to signal the abort condition

   At this point the router is all set and can send the RFRAG-ACK back
   to the previous router.  The RFRAG-ACK should normally be forwarded
   all the way to the source using the reverse LSP state in the VRBs in
   the intermediate routers as described in the next section.

6.2.  Upon the RFRAG Acknowledgments

   Upon an RFRAG-ACK, the router looks up a Reverse LSP indexed by the
   tuple (MAC address, datagram_tag), which are respectively the source
   MAC address of the received frame and the received datagram_tag.  If
   it is found, the router forwards the fragment using the associated
   VRB as prescribed by [I-D.ietf-6lo-minimal-fragment], but using the
   Reverse LSP so that the RFRAG-ACK flows back to the sender endpoint.

   If the Reverse LSP is not found, the router MUST silently drop the
   RFRAG-ACK message.

Thubert                   Expires 25 April 2020                [Page 15]

Internet-Draft               Selective RFRAG                October 2019

   Either way, if the RFRAG-ACK indicates that the fragment was entirely
   received (FULL bitmap), it arms a short timer, and upon timeout, the
   VRB and all the associated state are destroyed.  Until the timer
   elapses, fragments of that datagram may still be received, e.g. if
   the RFRAG-ACK was lost on the way back and the source retried the
   last fragment.  In that case, the router forwards the fragment
   according to the state in the VRB.

   This specification does not provide a method to discover the number
   of hops or the minimal value of MTU along those hops.  But should the
   minimal MTU decrease, it is possible to retry a long fragment (say
   sequence of 5) with first a shorter fragment of the same sequence (5
   again) and then one or more other fragments with a sequence that was
   not used before (e.g., 13 and 14).  Note that Path MTU Discovery is
   out of scope for this document.

6.3.  Aborting the Transmission of a Fragmented Packet

   A reset is signaled on the forward path with a pseudo fragment that
   has the fragment_offset, sequence and fragment_size all set to 0, and
   no data.

   When the sender or a router on the way decides that a packet should
   be dropped and the fragmentation process aborted, it generates a
   reset pseudo fragment and forwards it down the fragment path.

   Each router next along the path the way forwards the pseudo fragment
   based on the VRB state.  If an acknowledgment is not requested, the
   VRB and all associated state are destroyed.

   Upon reception of the pseudo fragment, the receiver cleans up all
   resources for the packet associated to the datagram_tag.  If an
   acknowledgment is requested, the receiver responds with a NULL

   The other way around, the receiver might need to abort the process of
   a fragmented packet for internal reasons, for instance if it is out
   of reassembly buffers, or if it keeps receiving fragments beyond a
   reasonable time while it considers that this packet is already fully
   reassembled and was passed to the upper layer.  In that case, the
   receiver SHOULD indicate so to the sender with a NULL bitmap in a
   RFRAG Acknowledgment.  Upon an acknowledgment with a NULL bitmap, the
   sender endpoint MUST abort the transmission of the fragmented

Thubert                   Expires 25 April 2020                [Page 16]

Internet-Draft               Selective RFRAG                October 2019

7.  Management Considerations

7.1.  Protocol Parameters

   There is no particular configuration on the receiver, as echoing ECN
   is always on.  The configuration only applies to the sender, which is
   in control of the transmission.  The management system SHOULD be
   capable of providing the parameters below:

   MinFragmentSize:  The MinFragmentSize is the minimum value for the

   OptFragmentSize:  The MinFragmentSize is the value for the
      Fragment_Size that the sender should use to start with.

   MaxFragmentSize:  The MaxFragmentSize is the maximum value for the
      Fragment_Size.  It MUST be lower than the minimum MTU along the
      path.  A large value augments the chances of buffer bloat and
      transmission loss.  The value MUST be less than 512 if the unit
      that is defined for the PHY layer is the octet.

   UseECN:  Indicates whether the sender should react to ECN.  When the
      sender reacts to ECN the Window_Size will vary between
      MinWindowSize and MaxWindowSize.

   MinWindowSize:  The minimum value of Window_Size that the sender can

   OptWindowSize:  The OptWindowSize is the value for the Window_Size
      that the sender should use to start with.

   MaxWindowSize:  The maximum value of Window_Size that the sender can
      use.  The value MUSt be less than 32.

   InterFrameGap:  Indicates a minimum amount of time between
      transmissions.  All packets to a same destination, and in
      particular fragments, may be subject to receive while transmitting
      and hidden terminal collisions with the next or the previous
      transmission as the fragments progress along a same path.  The
      InterFrameGap protects the propagation of one transmission before
      the next one is triggered and creates a duty cycle that controls
      the ratio of air time and memory in intermediate nodes that a
      particular datagram will use.

   MinARQTimeOut:  The maximum amount of time a node should wait for an
      RFRAG Acknowledgment before it takes a next action.

Thubert                   Expires 25 April 2020                [Page 17]

Internet-Draft               Selective RFRAG                October 2019

   OptARQTimeOut:  The starting point of the value of the amount that a
      sender should wait for an RFRAG Acknowledgment before it takes a
      next action.

   MaxARQTimeOut:  The maximum amount of time a node should wait for an
      RFRAG Acknowledgment before it takes a next action.

   MaxFragRetries:  The maximum number of retries for a particular

   MaxDatagramRetries:  The maximum number of retries from scratch for a
      particular Datagram.

7.2.  Observing the network

   The management system should monitor the amount of retries and of ECN
   settings that can be observed from the perspective of the both the
   sender and the receiver, and may tune the optimum size of
   Fragment_Size and of the Window_Size, OptWindowSize and OptWindowSize
   respectively, at the sender.  The values should be bounded by the
   expected number of hops and reduced beyond that when the number of
   datagrams that can traverse an intermediate point may exceed its
   capacity and cause a congestion loss.  The InterFrameGap is another
   tool that can be used to increase the spacing between fragments of a
   same datagram and reduce the ratio of time when a particular
   intermediate node holds a fragment of that datagram.

8.  Security Considerations

   The considerations in the Security section of [I-D.ietf-core-cocoa]
   apply equally to this specification.

   The process of recovering fragments does not appear to create any
   opening for new threat compared to "Transmission of IPv6 Packets over
   IEEE 802.15.4 Networks" [RFC4944].

   The Virtual Recovery Buffer inherited from
   [I-D.ietf-6lo-minimal-fragment] may be used to perform a Denial-of-
   Service (DoS) attack against the intermediate Routers since the
   routers need to maintain a state per flow.  The VRB implementation
   technique described in [I-D.ietf-lwig-6lowpan-virtual-reassembly]
   allows to realign which data goes in which fragment which causes the
   intermediate node to store a portion of the data, which adds an
   attack vector that is not present with this specification.  With this
   specification, the data that is transported in each fragment is
   conserved and the state to keep does not include any data that would
   not fit in the previous fragment.

Thubert                   Expires 25 April 2020                [Page 18]

Internet-Draft               Selective RFRAG                October 2019

9.  IANA Considerations

   This document allocates 4 values in Page 0 for recoverable fragments
   from the "Dispatch Type Field" registry that was created by
   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   and reformatted by "6LoWPAN Paging Dispatch" [RFC8025].

   The suggested values (to be confirmed by IANA) are indicated in
   Table 1.

   | Bit Pattern | Page |           Header Type            | Reference |
   | 11 10100x   | 0    | RFRAG - Recoverable Fragment     | THIS RFC  |
   | 11 10101x   | 0    | RFRAG-ACK - RFRAG                | THIS RFC  |
   |             |      | Acknowledgment                   |           |

              Table 1: Additional Dispatch Value Bit Patterns

10.  Acknowledgments

   The author wishes to thank Michel Veillette, Dario Tedeschi, Laurent
   Toutain, Carles Gomez Montenegro, Thomas Watteyne and Michael
   Richardson for in-depth reviews and comments.  Also many thanks to
   Jonathan Hui, Jay Werb, Christos Polyzois, Soumitri Kolavennu, Pat
   Kinney, Margaret Wasserman, Richard Kelsey, Carsten Bormann and Harry
   Courtice for their various contributions.

11.  Normative References

              Watteyne, T., Bormann, C., and P. Thubert, "6LoWPAN
              Fragment Forwarding", Work in Progress, Internet-Draft,
              draft-ietf-6lo-minimal-fragment-04, 2 September 2019,

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,

   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
              "Transmission of IPv6 Packets over IEEE 802.15.4
              Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007,

Thubert                   Expires 25 April 2020                [Page 19]

Internet-Draft               Selective RFRAG                October 2019

   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
              DOI 10.17487/RFC6282, September 2011,

   [RFC6554]  Hui, J., Vasseur, JP., Culler, D., and V. Manral, "An IPv6
              Routing Header for Source Routes with the Routing Protocol
              for Low-Power and Lossy Networks (RPL)", RFC 6554,
              DOI 10.17487/RFC6554, March 2012,

   [RFC8025]  Thubert, P., Ed. and R. Cragie, "IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Paging Dispatch",
              RFC 8025, DOI 10.17487/RFC8025, November 2016,

   [RFC8138]  Thubert, P., Ed., Bormann, C., Toutain, L., and R. Cragie,
              "IPv6 over Low-Power Wireless Personal Area Network
              (6LoWPAN) Routing Header", RFC 8138, DOI 10.17487/RFC8138,
              April 2017, <https://www.rfc-editor.org/info/rfc8138>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

12.  Informative References

              Thubert, P., "An Architecture for IPv6 over the TSCH mode
              of IEEE 802.15.4", Work in Progress, Internet-Draft,
              draft-ietf-6tisch-architecture-27, 18 October 2019,

              Bormann, C., Betzler, A., Gomez, C., and I. Demirkol,
              "CoAP Simple Congestion Control/Advanced", Work in
              Progress, Internet-Draft, draft-ietf-core-cocoa-03, 21
              February 2018,

              Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile", Work
              in Progress, Internet-Draft, draft-ietf-intarea-frag-
              fragile-17, 30 September 2019,

Thubert                   Expires 25 April 2020                [Page 20]

Internet-Draft               Selective RFRAG                October 2019

              Bormann, C. and T. Watteyne, "Virtual reassembly buffers
              in 6LoWPAN", Work in Progress, Internet-Draft, draft-ietf-
              lwig-6lowpan-virtual-reassembly-01, 11 March 2019,

              IEEE, "IEEE Standard for Low-Rate Wireless Networks",
              IEEE Standard 802.15.4, DOI 10.1109/IEEE
              P802.15.4-REVd/D01, October 2019,

   [Kent]     Kent, C. and J. Mogul, ""Fragmentation Considered
              Harmful", In Proc. SIGCOMM '87 Workshop on Frontiers in
              Computer Communications Technology",
              DOI 10.1145/55483.55524, August 1987,

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000,

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001,

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001,

   [RFC4919]  Kushalnagar, N., Montenegro, G., and C. Schumacher, "IPv6
              over Low-Power Wireless Personal Area Networks (6LoWPANs):
              Overview, Assumptions, Problem Statement, and Goals",
              RFC 4919, DOI 10.17487/RFC4919, August 2007,

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963,
              DOI 10.17487/RFC4963, July 2007,

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,

Thubert                   Expires 25 April 2020                [Page 21]

Internet-Draft               Selective RFRAG                October 2019

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011,

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012,

   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
              Statement and Requirements for IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Routing",
              RFC 6606, DOI 10.17487/RFC6606, May 2012,

   [RFC7554]  Watteyne, T., Ed., Palattella, M., and L. Grieco, "Using
              IEEE 802.15.4e Time-Slotted Channel Hopping (TSCH) in the
              Internet of Things (IoT): Problem Statement", RFC 7554,
              DOI 10.17487/RFC7554, May 2015,

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017, <https://www.rfc-editor.org/info/rfc8085>.

   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
              Explicit Congestion Notification (ECN)", RFC 8087,
              DOI 10.17487/RFC8087, March 2017,

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017,

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017,

Thubert                   Expires 25 April 2020                [Page 22]

Internet-Draft               Selective RFRAG                October 2019

Appendix A.  Rationale

   There are a number of uses for large packets in Wireless Sensor
   Networks.  Such usages may not be the most typical or represent the
   largest amount of traffic over the LLN; however, the associated
   functionality can be critical enough to justify extra care for
   ensuring effective transport of large packets across the LLN.

   The list of those usages includes:

   Towards the LLN node:  Firmware update:  For example, a new version
         of the LLN node software is downloaded from a system manager
         over unicast or multicast services.  Such a reflashing
         operation typically involves updating a large number of similar
         LLN nodes over a relatively short period of time.

                          Packages of Commands:  A number of commands or
         a full configuration can be packaged as a single message to
         ensure consistency and enable atomic execution or complete roll
         back.  Until such commands are fully received and interpreted,
         the intended operation will not take effect.

   From the LLN node:  Waveform captures:  A number of consecutive
         samples are measured at a high rate for a short time and then
         transferred from a sensor to a gateway or an edge server as a
         single large report.

                       Data logs:  LLN nodes may generate large logs of
         sampled data for later extraction.  LLN nodes may also generate
         system logs to assist in diagnosing problems on the node or

                       Large data packets:  Rich data types might
         require more than one fragment.

   Uncontrolled firmware download or waveform upload can easily result
   in a massive increase of the traffic and saturate the network.

   When a fragment is lost in transmission, the lack of recovery in the
   original fragmentation system of RFC 4944 implies that all fragments
   would need to be resent, further contributing to the congestion that
   caused the initial loss, and potentially leading to congestion

   This saturation may lead to excessive radio interference, or random
   early discard (leaky bucket) in relaying nodes.  Additional queuing
   and memory congestion may result while waiting for a low power next
   hop to emerge from its sleeping state.

Thubert                   Expires 25 April 2020                [Page 23]

Internet-Draft               Selective RFRAG                October 2019

   Considering that RFC 4944 defines an MTU is 1280 bytes and that in
   most incarnations (but 802.15.4g) a IEEE Std. 802.15.4 frame can
   limit the MAC payload to as few as 74 bytes, a packet might be
   fragmented into at least 18 fragments at the 6LoWPAN shim layer.
   Taking into account the worst-case header overhead for 6LoWPAN
   Fragmentation and Mesh Addressing headers will increase the number of
   required fragments to around 32.  This level of fragmentation is much
   higher than that traditionally experienced over the Internet with
   IPv4 fragments.  At the same time, the use of radios increases the
   probability of transmission loss and Mesh-Under techniques compound
   that risk over multiple hops.

   Mechanisms such as TCP or application-layer segmentation could be
   used to support end-to-end reliable transport.  One option to support
   bulk data transfer over a frame-size-constrained LLN is to set the
   Maximum Segment Size to fit within the link maximum frame size.
   Doing so, however, can add significant header overhead to each
   802.15.4 frame.  In addition, deploying such a mechanism requires
   that the end-to-end transport is aware of the delivery properties of
   the underlying LLN, which is a layer violation, and difficult to
   achieve from the far end of the IPv6 network.

Appendix B.  Requirements

   For one-hop communications, a number of Low Power and Lossy Network
   (LLN) link-layers propose a local acknowledgment mechanism that is
   enough to detect and recover the loss of fragments.  In a multihop
   environment, an end-to-end fragment recovery mechanism might be a
   good complement to a hop-by-hop MAC level recovery.  This draft
   introduces a simple protocol to recover individual fragments between
   6LoWPAN endpoints that may be multiple hops away.  The method
   addresses the following requirements of a LLN:

   Number of fragments  The recovery mechanism must support highly
      fragmented packets, with a maximum of 32 fragments per packet.

   Minimum acknowledgment overhead  Because the radio is half duplex,
      and because of silent time spent in the various medium access
      mechanisms, an acknowledgment consumes roughly as many resources
      as data fragment.

      The new end-to-end fragment recovery mechanism should be able to
      acknowledge multiple fragments in a single message and not require
      an acknowledgment at all if fragments are already protected at a
      lower layer.

   Controlled latency  The recovery mechanism must succeed or give up

Thubert                   Expires 25 April 2020                [Page 24]

Internet-Draft               Selective RFRAG                October 2019

      within the time boundary imposed by the recovery process of the
      Upper Layer Protocols.

   Optional congestion control  The aggregation of multiple concurrent
      flows may lead to the saturation of the radio network and
      congestion collapse.

      The recovery mechanism should provide means for controlling the
      number of fragments in transit over the LLN.

Appendix C.  Considerations On Flow Control

   Considering that a multi-hop LLN can be a very sensitive environment
   due to the limited queuing capabilities of a large population of its
   nodes, this draft recommends a simple and conservative approach to
   Congestion Control, based on TCP congestion avoidance.

   Congestion on the forward path is assumed in case of packet loss, and
   packet loss is assumed upon time out.  The draft allows to control
   the number of outstanding fragments, that have been transmitted but
   for which an acknowledgment was not received yet.  It must be noted
   that the number of outstanding fragments should not exceed the number
   of hops in the network, but the way to figure the number of hops is
   out of scope for this document.

   Congestion on the forward path can also be indicated by an Explicit
   Congestion Notification (ECN) mechanism.  Though whether and how ECN
   [RFC3168] is carried out over the LoWPAN is out of scope, this draft
   provides a way for the destination endpoint to echo an ECN indication
   back to the source endpoint in an acknowledgment message as
   represented in Figure 4 in Section 5.2.

   It must be noted that congestion and collision are different topics.
   In particular, when a mesh operates on a same channel over multiple
   hops, then the forwarding of a fragment over a certain hop may
   collide with the forwarding of a next fragment that is following over
   a previous hop but in a same interference domain.  This draft enables
   an end-to-end flow control, but leaves it to the sender stack to pace
   individual fragments within a transmit window, so that a given
   fragment is sent only when the previous fragment has had a chance to
   progress beyond the interference domain of this hop.  In the case of
   6TiSCH [I-D.ietf-6tisch-architecture], which operates over the
   TimeSlotted Channel Hopping [RFC7554] (TSCH) mode of operation of
   IEEE802.14.5, a fragment is forwarded over a different channel at a
   different time and it makes full sense to transmit the next fragment
   as soon as the previous fragment has had its chance to be forwarded
   at the next hop.

Thubert                   Expires 25 April 2020                [Page 25]

Internet-Draft               Selective RFRAG                October 2019

   From the standpoint of a source 6LoWPAN endpoint, an outstanding
   fragment is a fragment that was sent but for which no explicit
   acknowledgment was received yet.  This means that the fragment might
   be on the way, received but not yet acknowledged, or the
   acknowledgment might be on the way back.  It is also possible that
   either the fragment or the acknowledgment was lost on the way.

   From the sender standpoint, all outstanding fragments might still be
   in the network and contribute to its congestion.  There is an
   assumption, though, that after a certain amount of time, a frame is
   either received or lost, so it is not causing congestion anymore.
   This amount of time can be estimated based on the round trip delay
   between the 6LoWPAN endpoints.  The method detailed in [RFC6298] is
   recommended for that computation.

   The reader is encouraged to read through "Congestion Control
   Principles" [RFC2914].  Additionally [RFC7567] and [RFC5681] provide
   deeper information on why this mechanism is needed and how TCP
   handles Congestion Control.  Basically, the goal here is to manage
   the amount of fragments present in the network; this is achieved by
   to reducing the number of outstanding fragments over a congested path
   by throttling the sources.

   Section 6 describes how the sender decides how many fragments are
   (re)sent before an acknowledgment is required, and how the sender
   adapts that number to the network conditions.

Author's Address

   Pascal Thubert (editor)
   Cisco Systems, Inc
   Building D, 45 Allee des Ormes - BP1200
   06254 MOUGINS - Sophia Antipolis

   Phone: +33 497 23 26 34
   Email: pthubert@cisco.com

Thubert                   Expires 25 April 2020                [Page 26]