Skip to main content

IPv6 over Low-Power Wireless Personal Area Network (6LoWPAN) Selective Fragment Recovery

The information below is for an old version of the document that is already published as an RFC.
Document Type
This is an older version of an Internet-Draft that was ultimately published as RFC 8931.
Author Pascal Thubert
Last updated 2020-11-23 (Latest revision 2020-03-23)
Replaces draft-thubert-6lo-fragment-recovery
RFC stream Internet Engineering Task Force (IETF)
Intended RFC status Proposed Standard
Additional resources Mailing list discussion
Stream WG state Submitted to IESG for Publication
Document shepherd Carles Gomez
Shepherd write-up Show Last changed 2019-10-23
IESG IESG state Became RFC 8931 (Proposed Standard)
Action Holders
Consensus boilerplate Yes
Telechat date (None)
Responsible AD Suresh Krishnan
Send notices to Carles Gomez <>
IANA IANA review state Version Changed - Review Needed
IANA action state RFC-Ed-Ack
IANA expert review state Expert Reviews OK
IANA expert review comments The expert approved the registrations, but wrote, "One slight concern is that it registers not 1 or 2 codes, but 4. However, rfc8025 was created precisely because of this problem, as these registrations are limited to page 0 only."
6lo                                                      P. Thubert, Ed.
Internet-Draft                                             Cisco Systems
Updates: 4944 (if approved)                                23 March 2020
Intended status: Standards Track                                        
Expires: 24 September 2020

                  6LoWPAN Selective Fragment Recovery


   This draft updates RFC 4944 with a protocol to forward individual
   fragments across a route-over mesh and recover them end-to-end, with
   congestion control capabilities to protect the network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 24 September 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Thubert                 Expires 24 September 2020               [Page 1]
Internet-Draft               Selective RFRAG                  March 2020

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.1.  BCP 14  . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.2.  References  . . . . . . . . . . . . . . . . . . . . . . .   4
     2.3.  Other Terms . . . . . . . . . . . . . . . . . . . . . . .   5
   3.  Updating RFC 4944 . . . . . . . . . . . . . . . . . . . . . .   6
   4.  Extending draft-ietf-6lo-minimal-fragment . . . . . . . . . .   6
     4.1.  Slack in the First Fragment . . . . . . . . . . . . . . .   6
     4.2.  Gap between frames  . . . . . . . . . . . . . . . . . . .   7
     4.3.  congestion Control  . . . . . . . . . . . . . . . . . . .   7
     4.4.  Modifying the First Fragment  . . . . . . . . . . . . . .   8
   5.  New Dispatch types and headers  . . . . . . . . . . . . . . .   8
     5.1.  Recoverable Fragment Dispatch type and Header . . . . . .   9
     5.2.  RFRAG Acknowledgment Dispatch type and Header . . . . . .  11
   6.  Fragment Recovery . . . . . . . . . . . . . . . . . . . . . .  12
     6.1.  Forwarding Fragments  . . . . . . . . . . . . . . . . . .  15
       6.1.1.  Receiving the first fragment  . . . . . . . . . . . .  15
       6.1.2.  Receiving the next fragments  . . . . . . . . . . . .  16
     6.2.  Receiving RFRAG Acknowledgments . . . . . . . . . . . . .  16
     6.3.  Aborting the Transmission of a Fragmented Packet  . . . .  17
     6.4.  Applying Recoverable Fragmentation along a Diverse
           Path  . . . . . . . . . . . . . . . . . . . . . . . . . .  18
   7.  Management Considerations . . . . . . . . . . . . . . . . . .  18
     7.1.  Protocol Parameters . . . . . . . . . . . . . . . . . . .  19
     7.2.  Observing the network . . . . . . . . . . . . . . . . . .  21
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  21
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  22
   10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  23
   11. Normative References  . . . . . . . . . . . . . . . . . . . .  23
   12. Informative References  . . . . . . . . . . . . . . . . . . .  24
   Appendix A.  Rationale  . . . . . . . . . . . . . . . . . . . . .  27
   Appendix B.  Requirements . . . . . . . . . . . . . . . . . . . .  28
   Appendix C.  Considerations on Congestion Control . . . . . . . .  29
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  32

1.  Introduction

   In most Low Power and Lossy Network (LLN) applications, the bulk of
   the traffic consists of small chunks of data (on the order of a few
   bytes to a few tens of bytes) at a time.  Given that an IEEE Std.
   802.15.4 [IEEE.802.15.4] frame can carry a payload of 74 bytes or
   more, fragmentation is usually not required.  However, and though
   this happens only occasionally, a number of mission critical
   applications do require the capability to transfer larger chunks of
   data, for instance to support the firmware upgrade of the LLN nodes
   or the extraction of logs from LLN nodes.

Thubert                 Expires 24 September 2020               [Page 2]
Internet-Draft               Selective RFRAG                  March 2020

   In the former case, the large chunk of data is transferred to the LLN
   node, whereas in the latter, the large chunk flows away from the LLN
   node.  In both cases, the size can be on the order of 10 kilobytes or
   more and an end-to-end reliable transport is required.

   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944]
   defines the original 6LoWPAN datagram fragmentation mechanism for
   LLNs.  One critical issue with this original design is that routing
   an IPv6 [RFC8200] packet across a route-over mesh requires the
   reassembly of the packet at each hop.  The "6TiSCH Architecture"
   [I-D.ietf-6tisch-architecture] indicates that this may cause latency
   along a path and impact critical resources such as memory and
   battery; to alleviate those undesirable effects it recommends using a
   6LoWPAN Fragment Forwarding (6FF) technique .

   "LLN Minimal Fragment Forwarding" [FRAG-FWD] specifies the generic
   behavior that all 6FF techniques including this specification follow,
   and presents the associated caveats.  In particular, the routing
   information is fully indicated in the first fragment, which is always
   forwarded first.  With this specification, the first fragment is
   identified by a Sequence of 0 as opposed to a dispatch type in
   [RFC4944].  A state is formed and used to forward all the next
   fragments along the same path.  The Datagram_Tag is locally
   significant to the Layer-2 source of the packet and is swapped at
   each hop, more in Section 6.  This specification encodes the
   Datagram_Tag in one byte, which will saturate if more than 256
   datagrams transit in fragmented form over a single hop at the same
   time.  This is not realistic at the time of this writing.  Should
   this happen in a new 6LoWPAN technology, a node will need to use
   several Link-Layer addresses to increase its indexing capacity.

   "Virtual reassembly buffers in 6LoWPAN" [LWIG-FRAG](VRB) proposes a
   6FF technique that is compatible with [RFC4944] without the need to
   define a new protocol.  However, adding that capability alone to the
   local implementation of the original 6LoWPAN fragmentation would not
   address the inherent fragility of fragmentation (see [FRAG-ILE]) in
   particular the issues of resources locked on the reassembling
   endpoint and the wasted transmissions due to the loss of a single
   fragment in a whole datagram.  [Kent] compares the unreliable
   delivery of fragments with a mechanism it calls "selective
   acknowledgements" that recovers the loss of a fragment individually.
   The paper illustrates the benefits that can be derived from such a
   method in figures 1, 2 and 3, on pages 6 and 7.  [RFC4944] has no
   selective recovery and the whole datagram fails when one fragment is
   not delivered to the reassembling endpoint.  Constrained memory
   resources are blocked on the reassembling endpoint until it times
   out, possibly causing the loss of subsequent packets that cannot be
   received for the lack of buffers.

Thubert                 Expires 24 September 2020               [Page 3]
Internet-Draft               Selective RFRAG                  March 2020

   That problem is exacerbated when forwarding fragments over multiple
   hops since a loss at an intermediate hop will not be discovered by
   either the fragmenting and reassembling endpoints, and the source
   will keep on sending fragments, wasting even more resources in the
   network since the datagram cannot arrive in its entirety, and
   possibly contributing to the condition that caused the loss.
   [RFC4944] is lacking a congestion control to avoid participating in a
   saturation that may have caused the loss of the fragment.  It has no
   signaling to abort a multi-fragment transmission at any time and from
   either end, and, if the capability to forward fragments is
   implemented, clean up the related state in the network.

   This specification provides a method to forward fragments over
   typically a few hops in a route-over 6LoWPAN mesh, and a selective
   acknowledgment to recover individual fragments between 6LoWPAN
   endpoints.  The method can help limit the congestion loss in the
   network and addresses the requirements in Appendix B.  Flow Control
   is out of scope since the endpoints are expected to be able to store
   the full datagram.  Deployments are expected to be managed and
   homogeneous, and an incremental transition requires a flag day.

2.  Terminology

2.1.  BCP 14

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119][RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.2.  References

   This document uses 6LoWPAN terms and concepts that are presented in
   "IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs):
   Overview, Assumptions, Problem Statement, and Goals" [RFC4919],
   "Transmission of IPv6 Packets over IEEE 802.15.4 Networks" [RFC4944],
   and "Problem Statement and Requirements for IPv6 over Low-Power
   Wireless Personal Area Network (6LoWPAN) Routing" [RFC6606].

   "LLN Minimal Fragment Forwarding" [FRAG-FWD] discusses the generic
   concept of a Virtual Reassembly Buffer (VRB) and specifies behaviors
   and caveats that are common to a large family of 6FF techniques
   including the mechanism specified by this document, which fully
   inherits from that specification.  It also defines terms used in this
   document: Compressed Form, Datagram_Tag, Datagram_Size,
   Fragment_Offset, and 6LoWPAN Fragment Forwarding endpoint (commonly
   abbreviated as only "endpoint").

Thubert                 Expires 24 September 2020               [Page 4]
Internet-Draft               Selective RFRAG                  March 2020

   Past experience with fragmentation has shown that misassociated or
   lost fragments can lead to poor network behavior and, occasionally,
   trouble at the application layer.  The reader is encouraged to read
   "IPv4 Reassembly Errors at High Data Rates" [RFC4963] and follow the
   references for more information.  That experience led to the
   definition of "Path MTU discovery" [RFC8201] (PMTUD) protocol that
   limits fragmentation over the Internet.  Specifically in the case of
   UDP, valuable additional information can be found in "UDP Usage
   Guidelines for Application Designers" [RFC8085].

   "The Benefits of Using Explicit Congestion Notification (ECN)"
   [RFC8087] provides useful information on the potential benefits and
   pitfalls of using ECN.

   Quoting the "Multiprotocol Label Switching (MPLS) Architecture"
   [RFC3031]: with MPLS, 'packets are "labeled" before they are
   forwarded' along a Label Switched Path (LSP).  At subsequent hops,
   there is no further analysis of the packet's network layer header.
   Rather, the label is used as an index into a table which specifies
   the next hop, and a new label".  [FRAG-FWD] leverages MPLS to forward
   fragments that actually do not have a network layer header, since the
   fragmentation occurs below IP, and this specification makes it
   reversible so the reverse path can be followed as well.

2.3.  Other Terms

   This specification uses the following terms:

   RFRAG:  Recoverable Fragment

   RFRAG-ACK:  Recoverable Fragment Acknowledgement

   RFRAG Acknowledgment Request:  An RFRAG with the Acknowledgement
      Request flag ('X' flag) set.

   NULL bitmap:  Refers to a bitmap with all bits set to zero.

   FULL bitmap:  Refers to a bitmap with all bits set to one.

   Reassembling endpoint:  The receiving endpoint

   Fragmenting endpoint:  The sending endpoint

   Forward direction:  The direction of a path, which is followed by the

   Reverse direction:  The reverse direction of a path, which is taken
      by the RFRAG-ACK.

Thubert                 Expires 24 September 2020               [Page 5]
Internet-Draft               Selective RFRAG                  March 2020

3.  Updating RFC 4944

   This specification updates the fragmentation mechanism that is
   specified in "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944] for use in route-over LLNs by providing a model
   where fragments can be forwarded end-to-end across a 6LoWPAN LLN, and
   where fragments that are lost on the way can be recovered
   individually.  A new format for fragments is introduced and new
   dispatch types are defined in Section 5.

   [RFC8138] allows modifying the size of a packet en route by removing
   the consumed hops in a compressed Routing Header.  This requires that
   Fragment_Offset and Datagram_Size (see Section 2.3) are also modified
   en route, which is difficult to do in the uncompressed form.  This
   specification expresses those fields in the Compressed Form and
   allows modifying them en route (see Section 4.4) easily.

   Consistently with Section 2 of [RFC6282], for the fragmentation
   mechanism described in Section 5.3 of [RFC4944], any header that
   cannot fit within the first fragment MUST NOT be compressed when
   using the fragmentation mechanism described in this specification.

4.  Extending draft-ietf-6lo-minimal-fragment

   This specification implements the generic 6FF technique defined in
   "LLN Minimal Fragment Forwarding" [FRAG-FWD], provides end-to-end
   fragment recovery and congestion control mechanisms.

4.1.  Slack in the First Fragment

   [FRAG-FWD] allows for refragmenting in intermediate nodes, meaning
   that some bytes from a given fragment may be left in the VRB to be
   added to the next fragment.  The need for more space in the outgoing
   fragment than was needed for the incoming fragment arises when the
   6LoWPAN Header Compression is not as efficient on the outgoing link
   or the Link MTU is reduced.

   This specification cannot allow such a refragmentation operation
   since the fragments are recovered end-to-end based on a sequence
   number.  The Fragment_Size MUST be tailored to fit the minimal MTU
   along the path, and the first fragment that contains a 6LoWPAN-
   compressed header MUST have enough slack to enable a less efficient
   compression in the next hops to still fits within the Link MTU.  If
   the fragmenting endpoint is also the 6LoWPAN compression endpoint, it
   will elide the IID of the source IPv6 address if it matches the Link-
   Layer address [RFC6282].  In a network with a consistent MTU, it MUST
   compute the Fragment_Size as if the MTU was 8 bytes less, so the next
   hop can expand the IID within the same fragment.

Thubert                 Expires 24 September 2020               [Page 6]
Internet-Draft               Selective RFRAG                  March 2020

4.2.  Gap between frames

   [FRAG-FWD] requires that a configurable interval of time is inserted
   between transmissions to the same next hop and in particular between
   fragments of a same datagram.  In the case of half duplex interfaces,
   this inter-frame gap ensures that the next hop is done forwarding the
   previous frame and is capable of receiving the next one.

   In the case of a mesh operating at a single frequency with
   omnidirectional antennas, a larger inter-frame gap is required to
   protect the frame against hidden terminal collisions with the
   previous frame of the same flow that is still progressing along a
   common path.

   The inter-frame gap is useful even for unfragmented datagrams, but it
   becomes a necessity for fragments that are typically generated in a
   fast sequence and are all sent over the exact same path.

4.3.  congestion Control

   The inter-frame gap is the only protection that [FRAG-FWD] imposes by
   default.  This document enables to group fragments in windows and
   request intermediate acknowledgements so the number of in-flight
   fragments can be bounded.  This document also adds an ECN mechanism
   that can be used to to protect the network by adapting the size of
   the window, the size of the fragments, and/or the inter-frame gap.

   This specification enables the fragmenting endpoint to apply a
   congestion control mechanism to tune those parameters, but the
   mechanism itself is out of scope.  In most cases, the expectation is
   that most datagrams will require only a few fragments, and that only
   the last fragment will be acknowledged.  A basic implementation of
   the fragmenting endpoint is NOT REQUIRED to vary the size of the
   window, the duration of the inter-frame gap or the size of a fragment
   in the middle of the transmission of a datagram, and it MAY ignore
   the ECN signal or simply reset the window to 1 (see Appendix C for
   more) until the end of this datagram upon detecting a congestion.

   An intermediate node that experiences a congestion MAY set the ECN
   bit in a fragment, and the reassembling endpoint echoes the ECN bit
   at most once at the next opportunity to acknowledge back.

   The size of the fragments is typically computed from the Link MTU to
   maximize the size of the resulting frames.  The size of the window
   and the duration of the inter-frame gap SHOULD be configurable, to
   reduce the chances of congestion and to follow the general
   recommendations in [FRAG-FWD], respectively.

Thubert                 Expires 24 September 2020               [Page 7]
Internet-Draft               Selective RFRAG                  March 2020

4.4.  Modifying the First Fragment

   The compression of the Hop Limit, of the source and destination
   addresses in the IPv6 Header, and of the Routing Header may change en
   route in a Route-Over mesh LLN.  If the size of the first fragment is
   modified, then the intermediate node MUST adapt the Datagram_Size,
   encoded in the Fragment_Size field, to reflect that difference.

   The intermediate node MUST also save the difference of Datagram_Size
   of the first fragment in the VRB and add it to the Fragment_Offset of
   all the subsequent fragments that it forwards for that datagram.

5.  New Dispatch types and headers

   This document specifies an alternative to the 6LoWPAN fragmentation
   sublayer [RFC4944] to emulate an Link MTU up to 2048 bytes for the
   upper layer, which can be the 6LoWPAN Header Compression sublayer
   that is defined in the "Compression Format for IPv6 Datagrams"
   [RFC6282] specification.  This specification also provides a reliable
   transmission of the fragments over a multihop 6LoWPAN route-over mesh
   network and a minimal congestion control to reduce the chances of
   congestion loss.

   A 6LoWPAN Fragment Forwarding [FRAG-FWD] technique derived from MPLS
   enables the forwarding of individual fragments across a 6LoWPAN
   route-over mesh without reassembly at each hop.  The Datagram_Tag is
   used as a label; it is locally unique to the node that owns the
   source Link-Layer address of the fragment, so together the Link-Layer
   address and the label can identify the fragment globally within the
   lifetime of the datagram.  A node may build the Datagram_Tag in its
   own locally-significant way, as long as the chosen Datagram_Tag stays
   unique to the particular datagram for its lifetime.  The result is
   that the label does not need to be globally unique but also that it
   must be swapped at each hop as the source Link-Layer address changes.

   In the following sections, a "Datagram_Tag" extends the semantics
   defined in [RFC4944] Section 5.3."Fragmentation Type and Header".
   The Datagram_Tag is a locally unique identifier for the datagram from
   the perspective of the sender.  This means that the Datagram_Tag
   identifies a datagram uniquely in the network when associated with
   the source of the datagram.  As the datagram gets forwarded, the
   source changes and the Datagram_Tag must be swapped as detailed in

   This specification extends RFC 4944 [RFC4944] with 2 new Dispatch
   types, for Recoverable Fragment (RFRAG) and for the RFRAG
   Acknowledgment back.  The new 6LoWPAN Dispatch types are taken from
   Page 0 [RFC8025] as indicated in Table 1 in Section 9.

Thubert                 Expires 24 September 2020               [Page 8]
Internet-Draft               Selective RFRAG                  March 2020

5.1.  Recoverable Fragment Dispatch type and Header

   In this specification, if the packet is compressed then the size and
   offset of the fragments are expressed with respect to the Compressed
   Form of the packet form as opposed to the uncompressed (native) form.

   The format of the fragment header is shown in Figure 1.  It is the
   same for all fragments though the Fragment_Offset is overloaded.  The
   format has a length and an offset, as well as a Sequence field.  This
   would be redundant if the offset was computed as the product of the
   Sequence by the length, but this is not the case.  The position of a
   fragment in the reassembly buffer is neither correlated with the
   value of the Sequence field nor with the order in which the fragments
   are received.  This enables refragmenting to cope with an MTU
   deduction, see the example of the fragment seq. 5 that is retried
   end-to-end as smaller fragments seq. 13 and 14 in Section 6.2.

   The first fragment is recognized by a Sequence of 0; it carries its
   Fragment_Size and the Datagram_Size of the compressed packet before
   it is fragmented, whereas the other fragments carry their
   Fragment_Size and Fragment_Offset.  The last fragment for a datagram
   is recognized when its Fragment_Offset and its Fragment_Size add up
   to the stored Datagram_Size of the packet identified by the sender
   Link-Layer address and the Datagram_Tag.

                             1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                       |1 1 1 0 1 0 0|E|  Datagram_Tag |
       |X| Sequence|   Fragment_Size   |       Fragment_Offset         |

                                                X set == Ack-Request

                  Figure 1: RFRAG Dispatch type and Header

   X:  1 bit; Ack-Request: when set, the fragmenting endpoint requires
      an RFRAG Acknowledgment from the reassembling endpoint.

   E:  1 bit; Explicit Congestion Notification; the "E" flag is cleared
      by the source of the fragment and set by intermediate routers to
      signal that this fragment experienced congestion along its path.

   Fragment_Size:  10-bit unsigned integer; the size of this fragment in
      a unit that depends on the Link-Layer technology.  Unless
      overridden by a more specific specification, that unit is the
      byte, which allows fragments up to 1024 bytes.

Thubert                 Expires 24 September 2020               [Page 9]
Internet-Draft               Selective RFRAG                  March 2020

   Datagram_Tag:  8 bits; an identifier of the datagram that is locally
      unique to the Link-Layer sender.

   Sequence:  5-bit unsigned integer; the sequence number of the
      fragment in the acknowledgement bitmap.  Fragments are numbered
      [0..N] where N is in [0..31].  A Sequence of 0 indicates the first
      fragment in a datagram, but non-zero values are not indicative of
      the position in the reassembly buffer.

   Fragment_Offset:  16-bit unsigned integer.

      When the Fragment_Offset is set to a non-0 value, its semantics
      depend on the value of the Sequence field as follows:

      *  For a first fragment (i.e., with a Sequence of 0), this field
         indicates the Datagram_Size of the compressed datagram, to help
         the reassembling endpoint allocate an adapted buffer for the
         reception and reassembly operations.  The fragment may be
         stored for local reassembly.  Alternatively, it may be routed
         based on the destination IPv6 address.  In that case, a VRB
         state must be installed as described in Section 6.1.1.
      *  When the Sequence is not 0, this field indicates the offset of
         the fragment in the Compressed Form of the datagram.  The
         fragment may be added to a local reassembly buffer or forwarded
         based on an existing VRB as described in Section 6.1.2.

      A Fragment_Offset that is set to a value of 0 indicates an abort
      condition and all state regarding the datagram should be cleaned
      up once the processing of the fragment is complete; the processing
      of the fragment depends on whether there is a VRB already
      established for this datagram, and the next hop is still

      *  if a VRB already exists and the next hop is still reachable,
         the fragment is to be forwarded along the associated Label
         Switched Path (LSP) as described in Section 6.1.2, without
         checking the value of the Sequence field;
      *  else, if the Sequence is 0, then the fragment is to be routed
         as described in Section 6.1.1, but no state is conserved
         afterwards.  In that case, the session if it exists is aborted
         and the packet is also forwarded in an attempt to clean up the
         next hops along the path indicated by the IPv6 header (possibly
         including a routing header).
      *  else (the Sequence is nonzero and either no VRB exists or the
         next hop is unavailable), the fragment cannot be forwarded or
         routed; the fragment is discarded and an abort RFRAG-ACK is
         sent back to the source as described in Section 6.1.2.

Thubert                 Expires 24 September 2020              [Page 10]
Internet-Draft               Selective RFRAG                  March 2020

   There is no requirement on the reassembling endpoint to check that
   the received fragments are consecutive and non-overlapping.  The
   fragmenting endpoint knows that the datagram is fully received when
   the acknowledged fragments cover the whole datagram, which is always
   the case with a FULL bitmap.  This may be useful in particular in the
   case where the MTU changes and a fragment Sequence is retried with a
   smaller Fragment_Size, the remainder of the original fragment being
   retried with new Sequence values.

   Recoverable Fragments are sequenced and a bitmap is used in the RFRAG
   Acknowledgment to indicate the received fragments by setting the
   individual bits that correspond to their sequence.

5.2.  RFRAG Acknowledgment Dispatch type and Header

   This specification also defines a 4-byte RFRAG Acknowledgment bitmap
   that is used by the reassembling endpoint to confirm selectively the
   reception of individual fragments.  A given offset in the bitmap maps
   one-to-one with a given sequence number and indicates which fragment
   is acknowledged as follows:

                            1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       |           RFRAG Acknowledgment Bitmap                         |
        ^                 ^
        |                 |    bitmap indicating whether:
        |                 +----- Fragment with Sequence 9 was received
        +----------------------- Fragment with Sequence 0 was received

               Figure 2: RFRAG Acknowledgment Bitmap Encoding

   Figure 3 shows an example Acknowledgment bitmap which indicates that
   all fragments from Sequence 0 to 20 were received, except for
   fragments 1, 2 and 16 were lost and must be retried.

                            1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

               Figure 3: Example RFRAG Acknowledgment Bitmap

   The RFRAG Acknowledgment Bitmap is included in an RFRAG
   Acknowledgment header, as follows:

Thubert                 Expires 24 September 2020              [Page 11]
Internet-Draft               Selective RFRAG                  March 2020

                            1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
                                       |1 1 1 0 1 0 1|E|  Datagram_Tag |
       |          RFRAG Acknowledgment Bitmap (32 bits)                |

          Figure 4: RFRAG Acknowledgment Dispatch type and Header

   E:  1 bit; Explicit Congestion Notification Echo

      When set, the fragmenting endpoint indicates that at least one of
      the acknowledged fragments was received with an Explicit
      Congestion Notification, indicating that the path followed by the
      fragments is subject to congestion.  More in Appendix C.

   Datagram_Tag:  8 bits; an identifier of the datagram that is locally
      unique to the Link-Layer recipient.

   RFRAG Acknowledgment Bitmap:  An RFRAG Acknowledgment Bitmap, whereby
      setting the bit at offset x indicates that fragment x was
      received, as shown in Figure 2.  A NULL bitmap indicates that the
      fragmentation process is aborted.  A FULL bitmap indicates that
      the fragmentation process is complete; all fragments were received
      at the reassembly endpoint.

6.  Fragment Recovery

   The Recoverable Fragment header RFRAG is used to transport a fragment
   and optionally request an RFRAG Acknowledgment RFRAG_ACK that
   confirms the reception of one or more fragments.  An RFRAG_ACK is
   carried as a standalone fragment header (i.e., with no 6LoWPAN
   payload) in a message that is propagated back to the fragmenting
   endpoint.  To achieve this, each hop that performed an MPLS-like
   operation on fragments reverses that operation for the RFRAG_ACK by
   sending a frame from the next hop to the previous hop as known by its
   Link-Layer address in the VRB.  The Datagram_Tag in the RFRAG_ACK is
   unique to the reassembling endpoint and is enough information for an
   intermediate hop to locate the VRB that contains the Datagram_Tag
   used by the previous hop and the Layer-2 information associated with
   it (interface and Link-Layer address).

   The fragmenting endpoint (i.e., the node fragments the packets at the
   6LoWPAN level) also controls the number of acknowledgments by setting
   the Ack-Request flag in the RFRAG packets.

Thubert                 Expires 24 September 2020              [Page 12]
Internet-Draft               Selective RFRAG                  March 2020

   The fragmenting endpoint may set the Ack-Request flag on any fragment
   to perform congestion control by limiting the number of outstanding
   fragments, which are the fragments that have been sent but for which
   reception or loss was not positively confirmed by the reassembling
   endpoint.  The maximum number of outstanding fragments is controlled
   by the Window-Size.  It is configurable and may vary in case of ECN
   notification.  When the endpoint that reassembles the packets at the
   6LoWPAN level receives a fragment with the Ack-Request flag set, it
   MUST send an RFRAG_ACK back to the originator to confirm reception of
   all the fragments it has received so far.

   The Ack-Request ('X') set in an RFRAG marks the end of a window.
   This flag MUST be set on the last fragment if the fragmenting
   endpoint wishes to perform an automatic repeat request (ARQ) process
   for the datagram, and it MAY be set in any intermediate fragment for
   the purpose of congestion control.

   This ARQ process MUST be protected by a Retransmission Time Out (RTO)
   timer, and the fragment that carries the 'X' flag MAY be retried upon
   a time out for a configurable number of times (see Section 7.1) with
   an exponential backoff.  Upon exhaustion of the retries the
   fragmenting endpoint may either abort the transmission of the
   datagram or resend the first fragment with an 'X' flag set in order
   to establish a new path for the datagram and obtain the list of
   fragments that were received over the old path in the acknowledgment
   bitmap.  When the knows that an underlying link-layer mechanism
   protects the fragments, it may refrain from using the RFRAG
   Acknowledgment mechanism, and never set the Ack-Request bit.

   The reassembling endpoint MAY issue unsolicited acknowledgments.  An
   unsolicited acknowledgment signals to the fragmenting endpoint that
   it can resume sending in case it has reached its maximum number of
   outstanding fragments.  Another use is to inform the fragmenting
   endpoint that the reassembling endpoint aborted the processing of an
   individual datagram.

   The RFRAG Acknowledgment carries an ECN indication for congestion
   control (see Appendix C).  The reassembling endpoint of a fragment
   with the 'E' (ECN) flag set MUST echo that information at most once
   by setting the 'E' (ECN) flag in the next RFRAG_ACK.

   In order to protect the datagram, the fragmenting endpoint transfers
   a controlled number of fragments and flags the last fragment of a
   window with an RFRAG Acknowledgment Request.  The reassembling
   endpoint MUST acknowledge a fragment with the acknowledgment request
   bit set.  If any fragment immediately preceding an acknowledgment
   request is still missing, the reassembling endpoint MAY intentionally
   delay its acknowledgment to allow in-transit fragments to arrive.

Thubert                 Expires 24 September 2020              [Page 13]
Internet-Draft               Selective RFRAG                  March 2020

   Because it might defeat the round-trip time computation, delaying the
   acknowledgment should be configurable and not enabled by default.

   When enough fragments are received to cover the whole datagram, the
   reassembling endpoint reconstructs the packet, passes it to the upper
   layer, sends an RFRAG_ACK on the reverse path with a FULL bitmap, and
   arms a short timer, e.g., on the order of an average round-trip time
   in the network.  The FULL bitmap is used as opposed to a bitmap that
   acknowledges only the received fragments to let the intermediate
   nodes know that the datagram is fully received.  As the timer runs,
   the reassembling endpoint absorbs the fragments that were still in
   flight for that datagram without creating a new state, acknowledging
   the ones that that bear an Ack-Request with an FRAG Acknowledgment
   and the FULL bitmap.  The reassembling endpoint aborts the
   communication if fragments with matching source and Datagram-Tag
   continue to be received after the timer expires.

   Note that acknowledgments might consume precious resources so the use
   of unsolicited acknowledgments SHOULD be configurable and not enabled
   by default.

   An observation is that streamlining forwarding of fragments generally
   reduces the latency over the LLN mesh, providing room for retries
   within existing upper-layer reliability mechanisms.  The fragmenting
   endpoint protects the transmission over the LLN mesh with a retry
   timer that is configured for a use case and may be adapted
   dynamically, e.g., according to the method detailed in [RFC6298].  It
   is expected that the upper layer retries obey the recommendations in
   [RFC8085], in which case a single round of fragment recovery should
   fit within the upper layer recovery timers.

   Fragments MUST be sent in a round-robin fashion: the sender MUST send
   all the fragments for a first time before it retries any lost
   fragment; lost fragments MUST be retried in sequence, oldest first.
   This mechanism enables the receiver to acknowledge fragments that
   were delayed in the network before they are retried.

   When a single radio frequency is used by contiguous hops, the
   fragmenting endpoint SHOULD insert a delay between the frames (e.g.,
   carrying fragments) that are sent to the same next hop.  The delay
   SHOULD cover multiple transmissions so as to let a frame progress a
   few hops and avoid hidden terminal issues.  This precaution is not
   required on channel hopping technologies such as Time Slotted Channel
   Hopping (TSCH) [RFC6554], where nodes that communicate at Layer-2 are
   scheduled to send and receive respectively, and different hops
   operate on different channels.

Thubert                 Expires 24 September 2020              [Page 14]
Internet-Draft               Selective RFRAG                  March 2020

6.1.  Forwarding Fragments

   This specification inherits from [FRAG-FWD] and proposes a Virtual
   Reassembly technique to forward fragments with no intermediate
   reconstruction of the entire datagram.

   The IPv6 Header MUST be placed in full in the first fragment to
   enable the routing decision.  The first fragment is routed and
   creates an LSP from the fragmenting endpoint to the reassembling
   endpoint.  The next fragments are label-switched along that LSP.  As
   a consequence, the next fragments can only follow the path that was
   set up by the first fragment and cannot follow an alternate route.
   The Datagram_Tag is used to carry the label, which is swapped in each

   If the first fragment is too large for the path MTU, it will
   repeatedly fail and never establish an LSP.  In that case, the
   fragmenting endpoint MAY retry the same datagram with a smaller
   Fragment_Size, in which case it MUST abort the original attempt and
   use a new Datagram_Tag for the new attempt.

6.1.1.  Receiving the first fragment

   In Route-Over mode, the source and destination Link-Layer addresses
   in a frame change at each hop.  The label that is formed and placed
   in the Datagram_Tag by the sender is associated with the source Link-
   Layer address and only valid (and temporarily unique) for that source
   Link-Layer address.

   Upon receiving the first fragment (i.e., with a Sequence of 0), an
   intermediate router creates a VRB and the associated LSP state
   indexed by the incoming interface, the previous-hop Link-Layer
   address, and the Datagram_Tag, and forwards the fragment along the
   IPv6 route that matches the destination IPv6 address in the IPv6
   header until it reaches the reassembling endpoint, as prescribed by
   [FRAG-FWD].  The LSP state enables to match the next incoming
   fragments of a datagram to the abstract forwarding information of
   next interface, source and next-hop Link-Layer addresses, and swapped

   In addition, the router also forms a reverse LSP state indexed by the
   interface to the next hop, the Link-Layer address the router uses as
   source for that datagram, and the swapped Datagram_Tag.  This reverse
   LSP state enables matching the tuple (interface, destination Link-
   Layer address, Datagram_Tag) found in an RFRAG_ACK to the abstract
   forwarding information (previous interface, previous Link-Layer
   address, Datagram_Tag) used to forward the RFRAG-ACK back to the
   fragmenting endpoint.

Thubert                 Expires 24 September 2020              [Page 15]
Internet-Draft               Selective RFRAG                  March 2020

6.1.2.  Receiving the next fragments

   Upon receiving the next fragment (i.e., with a non-zero Sequence), an
   intermediate router looks up a LSP indexed by the tuple (incoming
   interface, previous-hop Link-Layer address, Datagram_Tag) found in
   the fragment.  If it is found, the router forwards the fragment using
   the associated VRB as prescribed by [FRAG-FWD].

   If the VRB for the tuple is not found, the router builds an RFRAG-ACK
   to abort the transmission of the packet.  The resulting message has
   the following information:

   *  The source and destination Link-Layer addresses are swapped from
      those found in the fragment and the same interface is used
   *  The Datagram_Tag is set to the Datagram_Tag found in the fragment
   *  A NULL bitmap is used to signal the abort condition

   At this point the router is all set and can send the RFRAG-ACK back
   to the previous router.  The RFRAG-ACK should normally be forwarded
   all the way to the source using the reverse LSP state in the VRBs in
   the intermediate routers as described in the next section.

   [FRAG-FWD] indicates that the reassembling endpoint stores "the
   actual packet data from the fragments received so far, in a form that
   makes it possible to detect when the whole packet has been received
   and can be processed or forwarded".  How this is computed is
   implementation specific but relies on receiving all the bytes up to
   the Datagram_Size indicated in the first fragment.  An implementation
   may receive overlapping fragments as the result of retries after an
   MTU change.

6.2.  Receiving RFRAG Acknowledgments

   Upon receipt of an RFRAG-ACK, the router looks up a reverse LSP
   indexed by the interface and destination Link-Layer address of the
   received frame and the received Datagram_Tag in the RFRAG-ACK.  If it
   is found, the router forwards the fragment using the associated VRB
   as prescribed by [FRAG-FWD], but using the reverse LSP so that the
   RFRAG-ACK flows back to the fragmenting endpoint.

   If the reverse LSP is not found, the router MUST silently drop the
   RFRAG-ACK message.

   Either way, if the RFRAG-ACK indicates that the fragment was entirely
   received (FULL bitmap), it arms a short timer, and upon timeout, the
   VRB and all the associated state are destroyed.  Until the timer
   elapses, fragments of that datagram may still be received, e.g. if
   the RFRAG-ACK was lost on the path back and the source retried the

Thubert                 Expires 24 September 2020              [Page 16]
Internet-Draft               Selective RFRAG                  March 2020

   last fragment.  In that case, the router generates an RFRAG-ACK with
   a FULL bitmap back to the fragmenting endpoint if an acknowledgement
   was requested, else it silently drops the fragment.

   This specification does not provide a method to discover the number
   of hops or the minimal value of MTU along those hops.  In a typical
   case, the MTU is constant and the same across the network.  But
   should the minimal MTU along the path decrease, it is possible to
   retry a long fragment (say Sequence of 5) with several shorter
   fragments with a Sequence that was not used before (e.g., 13 and 14).
   Fragment 5 is marked as abandoned and will not be retried anymore.
   Note that when this mechanism is in place, it is hard to predict the
   total number of fragments that will be needed or the final shape of
   the bitmap that would cover the whole packet.  This is why the FULL
   bitmap is used when the reassembling endpoint gets the whole datagram
   regardless of which fragments were actually used to do so.
   Intermediate nodes will unabiguously know that the process is
   complete.  Note that Path MTU Discovery is out of scope for this

6.3.  Aborting the Transmission of a Fragmented Packet

   A reset is signaled on the forward path with a pseudo fragment that
   has the Fragment_Offset set to 0.  The sender of a reset SHOULD also
   set the Sequence and Fragment_Size field to 0.

   When the fragmenting endpoint or a router on the path decides that a
   packet should be dropped and the fragmentation process aborted, it
   generates a reset pseudo fragment and forwards it down the fragment

   Each router next along the path the way forwards the pseudo fragment
   based on the VRB state.  If an acknowledgment is not requested, the
   VRB and all associated state are destroyed.

   Upon reception of the pseudo fragment, the reassembling endpoint
   cleans up all resources for the packet associated with the
   Datagram_Tag.  If an acknowledgment is requested, the reassembling
   endpoint responds with a NULL bitmap.

   The other way around, the reassembling endpoint might need to abort
   the processing of a fragmented packet for internal reasons, for
   instance if it is out of reassembly buffers, already uses all 256
   possible values of the Datagram_Tag, or if it keeps receiving
   fragments beyond a reasonable time while it considers that this
   packet is already fully reassembled and was passed to the upper
   layer.  In that case, the reassembling endpoint SHOULD indicate so to
   the fragmenting endpoint with a NULL bitmap in an RFRAG_ACK.

Thubert                 Expires 24 September 2020              [Page 17]
Internet-Draft               Selective RFRAG                  March 2020

   The RFRAG_ACK is forwarded all the way back to the source of the
   packet and cleans up all resources on the path.  Upon an
   acknowledgment with a NULL bitmap, the fragmenting endpoint MUST
   abort the transmission of the fragmented datagram with one exception:
   In the particular case of the first fragment, it MAY decide to retry
   via an alternate next hop instead.

6.4.  Applying Recoverable Fragmentation along a Diverse Path

   The text above can be read with the assumption of a serial path
   between a source and a destination.  Section 4.5.3 of the "6TiSCH
   Architecture" [I-D.ietf-6tisch-architecture] defines the concept of a
   Track that can be a complex path between a source and a destination
   with Packet ARQ, Replication, Elimination and Overhearing (PAREO)
   along the Track.  This specification can be used along any subset of
   the complex Track where the first fragment is flooded.  The last
   RFRAG Acknowledgment is flooded on that same subset in the reverse
   direction.  Intermediate RFRAG Acknowledgments can be flooded on any
   sub-subset of that reverse subset that reach back to the source.

7.  Management Considerations

   This specification extends "On Forwarding 6LoWPAN Fragments over a
   Multihop IPv6 Network" [FRAG-FWD] and requires the same parameters in
   the reassembling endpoint and on intermediate nodes.  There is no new
   parameter as echoing ECN is always on.  These parameters typically
   include the reassembly timeout at the reassembling endpoint and an
   inactivity clean-up timer on the intermediate nodes, and the number
   of messages that can be processed in parallel in all nodes.

   The configuration settings introduced by this specification only
   apply to the fragmenting endpoint, which is in full control of the
   transmission.  LLNs vary a lot in size (there can be thousands of
   nodes in a mesh), in speed (from 10 Kbps to several Mbps at the PHY
   layer), in traffic density, and in optimizations that are desired
   (e.g., the selection of a RPL [RFC6550] Objective Function [RFC6552]
   impacts the shape of the routing graph).

   For that reason, only a very generic guidance can be given on the
   settings of the fragmenting endpoint and on whether complex
   algorithms are needed to perform congestion control or estimate the
   round-trip time.  To cover the most complex use cases, this
   specification enables the fragmenting endpoint to vary the fragment
   size, the window size, and the inter-frame gap, based on the number
   of losses, the observed variations of the round-trip time and the
   setting of the ECN bit.

Thubert                 Expires 24 September 2020              [Page 18]
Internet-Draft               Selective RFRAG                  March 2020

7.1.  Protocol Parameters

   The management system SHOULD be capable of providing the parameters
   listed in this section and an implementation MUST abide by those
   parameters and in particular never exceed the minimum and maximum
   configured boundaries.

   An implementation should consider the generic recommendations from
   the IETF in the matter of congestion control and rate management for
   IP datagrams in [RFC8085].  An implementation may perform a
   congestion control by using a dynamic value of the window size
   (Window_Size), adapting the fragment size (Fragment_Size), and may
   reduce the load by inserting an inter-frame gap that is longer than
   necessary.  In a large network where nodes contend for the bandwidth,
   a larger Fragment_Size consumes less bandwidth but also reduces
   fluidity and incurs higher chances of loss in transmission.

   This is controlled by the following parameters:

   inter-frame gap:  The inter-frame gap indicates the minimum amount of
      time between transmissions.  The inter-frame gap controls the rate
      at which fragments are sent, the ratio of air time, and the amount
      of memory in intermediate nodes that a particular datagram will
      use.  It can be used as a flow control, a congestion control, and/
      or a collision control measure.  It MUST be set at a minimum to a
      value that protects the propagation of one transmission against
      collision with next [FRAG-FWD].  In a wireless network that uses
      the same frequency along a path, this may represent the time for a
      frame to progress over multiple hops (more in Section 4.2).  It
      SHOULD be augmented beyond this as necessary to protect the
      network against congestion.

   MinFragmentSize:  The MinFragmentSize is the minimum value for the
      Fragment_Size.  It MUST be lower than the minimum value of
      smallest 1-hop MTU that can be encountered along the path.

   OptFragmentSize:  The OptFragmentSize is the value for the
      Fragment_Size that the fragmenting endpoint should use to start
      with.  It is greater than or equal to MinFragmentSize.  It is less
      than or equal to MaxFragmentSize.  For the first fragment, it must
      account for the expansion of the IPv6 addresses and of the Hop
      Limit field within MTU.  For all fragments, it is a balance
      between the expected fluidity and the overhead of Link-Layer and
      6LoWPAN headers.  For a small MTU, the idea is to keep it close to
      the maximum, whereas for larger MTUs, it might makes sense to keep
      it short enough, so that the duty cycle of the transmitter is
      bounded, e.g., to transmit at least 10 frames per second.

Thubert                 Expires 24 September 2020              [Page 19]
Internet-Draft               Selective RFRAG                  March 2020

   MaxFragmentSize:  The MaxFragmentSize is the maximum value for the
      Fragment_Size.  It MUST be lower than the maximum value of
      smallest 1-hop MTU that can be encountered along the path.  A
      large value augments the chances of buffer bloat and transmission
      loss.  The value MUST be less than 512 if the unit that is defined
      for the PHY layer is the byte.

   Window_Size:  The Window_Size MUST be at least 1 and less than 33.

      *  If the round-trip time is known, the Window_Size SHOULD be set
         to the round-trip time divided by the time per fragment, that
         is the time to transmit a fragment plus the inter-frame gap.


      *  Setting the window_size to 32 is to be understood as only the
         last Fragment is acknowledged in each round.  This is the
         RECOMMENDED value in a half-duplex LLN where the fragment
         acknowledgement consumes roughly the same bandwidth on the same
         links as the fragments themselves

      *  If it is set to a smaller value, more acks are generated.  In a
         full-duplex network, the load on the forward path will be
         lower, and a small value of 3 SHOULD be configured.

   An implementation may perform its estimate of the RTO or use a
   configured one.  The ARQ process is controlled by the following

   MinARQTimeOut:  The minimum amount of time a node should wait for an
      RFRAG Acknowledgment before it takes the next action.  It MUST be
      more than the maximum expected round-trip time in the respective

   OptARQTimeOut:  The initial value of the RTO, which is the amount of
      time that a fragmenting endpoint should wait for an RFRAG
      Acknowledgment before it takes the next action.  It is greater
      than or equal to MinARQTimeOut.  It is less than or equal to
      MaxARQTimeOut.  See Appendix C for recommendations on computing
      the round-trip time.  By default a value of 3 times the maximum
      expected round-trip time in the respective network is RECOMMENDED.

   MaxARQTimeOut:  The maximum amount of time a node should wait for the
      RFRAG Acknowledgment before it takes the next action.  It must
      cover the longest expected round-trip time, and be several times
      less than the timeout that covers the recomposition buffer at the
      reassembling endpoint, which is typically on the order of the
      minute.  An upper bound can be estimated to ensure that the

Thubert                 Expires 24 September 2020              [Page 20]
Internet-Draft               Selective RFRAG                  March 2020

      datagram is either fully transmitted or dropped before an upper
      layer decides to retry it.

   MaxFragRetries:  The maximum number of retries for a particular
      fragment.  A default value of 3 is RECOMMENDED.  An upper bound
      can be estimated to ensure that the datagram is either fully
      transmitted or dropped before an upper layer decides to retry it.

   MaxDatagramRetries:  The maximum number of retries from scratch for a
      particular datagram.  A default value of 1 is RECOMMENDED.  An
      upper bound can be estimated to ensure that the datagram is either
      fully transmitted or dropped before an upper layer decides to
      retry it.

   An implementation may be capable of performing congestion control
   based on ECN; see in Appendix C.  This is controlled by the following

   UseECN:  Indicates whether the fragmenting endpoint should react to
      ECN.  The fragmenting endpoint may react to ECN by varying the
      Window_Size between MinWindowSize and MaxWindowSize, varying the
      Fragment_Size between MinFragmentSize and MaxFragmentSize, and/or
      by increasing or reducing the inter-frame gap.  With this
      specification, if UseECN is set and a fragmenting endpoint detects
      a congestion, it may apply a congestion control method until the
      end of the datagram, whereas if UseECN is reset, the endpoint does
      not react to congestion.  Future specifications may provide
      additional parameters and capabilities.

7.2.  Observing the network

   The management system should monitor the number of retries and of ECN
   settings that can be observed from the perspective of both the
   fragmenting endpoint and the reassembling endpoint with regards to
   the other endpoint.  It may then tune the optimum size of
   Fragment_Size and of Window_Size, OptFragmentSize, and OptWindowSize,
   respectively, at the fragmenting endpoint towards a particular
   reassembling endpoint, applicable to the next datagrams.  It will
   preferably tune the inter-frame gap to increase the spacing between
   fragments of the same datagram and reduce the buffer bloat in
   intermediate node that holds one or more fragments of that datagram.

8.  Security Considerations

   This document specifies an instantiation of a 6FF technique and
   inherits from the generic description in [FRAG-FWD].  The
   considerations in the Security Section of [FRAG-FWD] equally apply to
   this document.

Thubert                 Expires 24 September 2020              [Page 21]
Internet-Draft               Selective RFRAG                  March 2020

   In addition to the threats detailed therein, an attacker that is on-
   path can prematurely end the transmission of a datagram by sending a
   RFRAG Acknowledgment to the fragmenting endpoint.  It can also cause
   extra transmissions of fragments by resetting bits in the RFRAG
   Acknowledgment bitmap, and of RFRAG Acknowledgments by forcing the
   Ack-Request bit in fragments that it forwards.

   As indicated in [FRAG-FWD], Secure joining and the Link-Layer
   security are REQUIRED to protect against those attacks, as the
   fragmentation protocol does not include any native security

   This specification does not recommend a particular algorithm for the
   estimation of the duration of the RTO that covers the detection of
   the loss of a fragment with the 'X' flag set; regardless, an attacker
   on the path may slow down or discard packets, which in turn can
   affect the throughput of fragmented packets.

   Compared to "Transmission of IPv6 Packets over IEEE 802.15.4
   Networks" [RFC4944], this specification reduces the Datagram_Tag to 8
   bits and the tag wraps faster than with [RFC4944].  But for a
   constrained network where a node is expected to be able to hold only
   one or a few large packets in memory, 256 is still a large number.
   Also, the acknowledgement mechanism allows cleaning up the state
   rapidly once the packet is fully transmitted or aborted.

   The abstract Virtual Recovery Buffer inherited from [FRAG-FWD] may be
   used to perform a Denial-of-Service (DoS) attack against the
   intermediate Routers since the routers need to maintain a state per
   flow.  The particular VRB implementation technique described in
   [LWIG-FRAG] allows realigning which data goes in which fragment,
   which causes the intermediate node to store a portion of the data,
   which adds an attack vector that is not present with this
   specification.  With this specification, the data that is transported
   in each fragment is conserved and the state to keep does not include
   any data that would not fit in the previous fragment.

9.  IANA Considerations

   This document allocates 2 patterns for a total of 4 dispatch values
   in Page 0 for recoverable fragments from the "Dispatch Type Field"
   registry that was created by "Transmission of IPv6 Packets over IEEE
   802.15.4 Networks" [RFC4944] and reformatted by "6LoWPAN Paging
   Dispatch" [RFC8025].

   The suggested patterns (to be confirmed by IANA) are indicated in
   Table 1.

Thubert                 Expires 24 September 2020              [Page 22]
Internet-Draft               Selective RFRAG                  March 2020

   | Bit Pattern | Page |           Header Type            | Reference |
   | 11 10100x   | 0    | RFRAG - Recoverable Fragment     | THIS RFC  |
   | 11 10100x   | 1-14 | Unassigned                       |           |
   | 11 10100x   | 15   | Reserved for Experimental Use    | RFC 8025  |
   | 11 10101x   | 0    | RFRAG-ACK - RFRAG                | THIS RFC  |
   |             |      | Acknowledgment                   |           |
   | 11 10101x   | 1-14 | Unassigned                       |           |
   | 11 10101x   | 15   | Reserved for Experimental Use    | RFC 8025  |

              Table 1: Additional Dispatch Value Bit Patterns

10.  Acknowledgments

   The author wishes to thank Michel Veillette, Dario Tedeschi, Laurent
   Toutain, Carles Gomez Montenegro, Thomas Watteyne, and Michael
   Richardson for in-depth reviews and comments.  Also many thanks to
   Roman Danyliw, Peter Yee, Colin Perkins, Tirumaleswar Reddy Konda,
   Eric Vyncke, Warren Kumari, Magnus Westerlund, Erik Nordmark, and
   especially Benjamin Kaduk and Mirja Kuhlewind for their careful
   reviews and for helping through the IETF Last Call and IESG review
   process, and to Jonathan Hui, Jay Werb, Christos Polyzois, Soumitri
   Kolavennu, Pat Kinney, Margaret Wasserman, Richard Kelsey, Carsten
   Bormann, and Harry Courtice for their various contributions in the
   long process that lead to this document.

11.  Normative References

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011,

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,

   [RFC4944]  Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler,
              "Transmission of IPv6 Packets over IEEE 802.15.4

Thubert                 Expires 24 September 2020              [Page 23]
Internet-Draft               Selective RFRAG                  March 2020

              Networks", RFC 4944, DOI 10.17487/RFC4944, September 2007,

   [RFC4919]  Kushalnagar, N., Montenegro, G., and C. Schumacher, "IPv6
              over Low-Power Wireless Personal Area Networks (6LoWPANs):
              Overview, Assumptions, Problem Statement, and Goals",
              RFC 4919, DOI 10.17487/RFC4919, August 2007,

   [RFC6282]  Hui, J., Ed. and P. Thubert, "Compression Format for IPv6
              Datagrams over IEEE 802.15.4-Based Networks", RFC 6282,
              DOI 10.17487/RFC6282, September 2011,

   [RFC6606]  Kim, E., Kaspar, D., Gomez, C., and C. Bormann, "Problem
              Statement and Requirements for IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Routing",
              RFC 6606, DOI 10.17487/RFC6606, May 2012,

   [RFC8025]  Thubert, P., Ed. and R. Cragie, "IPv6 over Low-Power
              Wireless Personal Area Network (6LoWPAN) Paging Dispatch",
              RFC 8025, DOI 10.17487/RFC8025, November 2016,

   [RFC8138]  Thubert, P., Ed., Bormann, C., Toutain, L., and R. Cragie,
              "IPv6 over Low-Power Wireless Personal Area Network
              (6LoWPAN) Routing Header", RFC 8138, DOI 10.17487/RFC8138,
              April 2017, <>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <>.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017,

   [FRAG-FWD] Watteyne, T., Thubert, P., and C. Bormann, "On Forwarding
              6LoWPAN Fragments over a Multihop IPv6 Network", Work in
              Progress, Internet-Draft, draft-ietf-6lo-minimal-fragment-
              14, 9 March 2020, <

12.  Informative References

Thubert                 Expires 24 September 2020              [Page 24]
Internet-Draft               Selective RFRAG                  March 2020

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017,

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001,

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, DOI 10.17487/RFC2914, September 2000,

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001,

   [RFC4963]  Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
              Errors at High Data Rates", RFC 4963,
              DOI 10.17487/RFC4963, July 2007,

   [RFC5033]  Floyd, S. and M. Allman, "Specifying New Congestion
              Control Algorithms", BCP 133, RFC 5033,
              DOI 10.17487/RFC5033, August 2007,

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012,

   [RFC6552]  Thubert, P., Ed., "Objective Function Zero for the Routing
              Protocol for Low-Power and Lossy Networks (RPL)",

Thubert                 Expires 24 September 2020              [Page 25]
Internet-Draft               Selective RFRAG                  March 2020

              RFC 6552, DOI 10.17487/RFC6552, March 2012,

   [RFC6554]  Hui, J., Vasseur, JP., Culler, D., and V. Manral, "An IPv6
              Routing Header for Source Routes with the Routing Protocol
              for Low-Power and Lossy Networks (RPL)", RFC 6554,
              DOI 10.17487/RFC6554, March 2012,

   [RFC7554]  Watteyne, T., Ed., Palattella, M., and L. Grieco, "Using
              IEEE 802.15.4e Time-Slotted Channel Hopping (TSCH) in the
              Internet of Things (IoT): Problem Statement", RFC 7554,
              DOI 10.17487/RFC7554, May 2015,

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017, <>.

   [RFC8087]  Fairhurst, G. and M. Welzl, "The Benefits of Using
              Explicit Congestion Notification (ECN)", RFC 8087,
              DOI 10.17487/RFC8087, March 2017,

              Bormann, C. and T. Watteyne, "Virtual reassembly buffers
              in 6LoWPAN", Work in Progress, Internet-Draft, draft-ietf-
              lwig-6lowpan-virtual-reassembly-02, 9 March 2020,

   [FRAG-ILE] Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O.,
              and F. Gont, "IP Fragmentation Considered Fragile", Work
              in Progress, Internet-Draft, draft-ietf-intarea-frag-
              fragile-17, 30 September 2019,

              Thubert, P., "An Architecture for IPv6 over the TSCH mode
              of IEEE 802.15.4", Work in Progress, Internet-Draft,
              draft-ietf-6tisch-architecture-28, 29 October 2019,

              IEEE, "IEEE Standard for Low-Rate Wireless Networks",
              IEEE Standard 802.15.4, DOI 10.1109/IEEE

Thubert                 Expires 24 September 2020              [Page 26]
Internet-Draft               Selective RFRAG                  March 2020


   [Kent]     Kent, C. and J. Mogul, ""Fragmentation Considered
              Harmful", In Proc. SIGCOMM '87 Workshop on Frontiers in
              Computer Communications Technology",
              DOI 10.1145/55483.55524, August 1987,

Appendix A.  Rationale

   There are a number of uses for large packets in Wireless Sensor
   Networks.  Such usages may not be the most typical or represent the
   largest amount of traffic over the LLN; however, the associated
   functionality can be critical enough to justify extra care for
   ensuring effective transport of large packets across the LLN.

   The list of those usages includes:

   Towards the LLN node:  Firmware update:  For example, a new version
         of the LLN node software is downloaded from a system manager
         over unicast or multicast services.  Such a reflashing
         operation typically involves updating a large number of similar
         LLN nodes over a relatively short period of time.

                          Packages of Commands:  A number of commands or
         a full configuration can be packaged as a single message to
         ensure consistency and enable atomic execution or complete roll
         back.  Until such commands are fully received and interpreted,
         the intended operation will not take effect.

   From the LLN node:  Waveform captures:  A number of consecutive
         samples are measured at a high rate for a short time and then
         transferred from a sensor to a gateway or an edge server as a
         single large report.

                       Data logs:  LLN nodes may generate large logs of
         sampled data for later extraction.  LLN nodes may also generate
         system logs to assist in diagnosing problems on the node or

                       Large data packets:  Rich data types might
         require more than one fragment.

   Uncontrolled firmware download or waveform upload can easily result
   in a massive increase of the traffic and saturate the network.

Thubert                 Expires 24 September 2020              [Page 27]
Internet-Draft               Selective RFRAG                  March 2020

   When a fragment is lost in transmission, the lack of recovery in the
   original fragmentation system of RFC 4944 implies that all fragments
   would need to be resent, further contributing to the congestion that
   caused the initial loss, and potentially leading to congestion

   This saturation may lead to excessive radio interference, or random
   early discard (leaky bucket) in relaying nodes.  Additional queuing
   and memory congestion may result while waiting for a low power next
   hop to emerge from its sleeping state.

   Considering that RFC 4944 defines an MTU is 1280 bytes and that in
   most incarnations (except 802.15.4g) a IEEE Std. 802.15.4 frame can
   limit the Link-Layer payload to as few as 74 bytes, a packet might be
   fragmented into at least 18 fragments at the 6LoWPAN shim layer.
   Taking into account the worst-case header overhead for 6LoWPAN
   Fragmentation and Mesh Addressing headers will increase the number of
   required fragments to around 32.  This level of fragmentation is much
   higher than that traditionally experienced over the Internet with
   IPv4 fragments.  At the same time, the use of radios increases the
   probability of transmission loss and Mesh-Under techniques compound
   that risk over multiple hops.

   Mechanisms such as TCP or application-layer segmentation could be
   used to support end-to-end reliable transport.  One option to support
   bulk data transfer over a frame-size-constrained LLN is to set the
   Maximum Segment Size to fit within the link maximum frame size.
   Doing so, however, can add significant header overhead to each
   802.15.4 frame and cause extraneous acknowledgements across the LLN
   compared to the method in this specification.

Appendix B.  Requirements

   For one-hop communications, a number of Low Power and Lossy Network
   (LLN) link-layers propose a local acknowledgment mechanism that is
   enough to detect and recover the loss of fragments.  In a multihop
   environment, an end-to-end fragment recovery mechanism might be a
   good complement to a hop-by-hop MAC recovery.  This draft introduces
   a simple protocol to recover individual fragments between 6FF
   endpoints that may be multiple hops away.

   The method addresses the following requirements of an LLN:

   Number of fragments:  The recovery mechanism must support highly
      fragmented packets, with a maximum of 32 fragments per packet.

   Minimum acknowledgment overhead:  Because the radio is half duplex,
      and because of silent time spent in the various medium access

Thubert                 Expires 24 September 2020              [Page 28]
Internet-Draft               Selective RFRAG                  March 2020

      mechanisms, an acknowledgment consumes roughly as many resources
      as a data fragment.

      The new end-to-end fragment recovery mechanism should be able to
      acknowledge multiple fragments in a single message and not require
      an acknowledgment at all if fragments are already protected at a
      lower layer.

   Controlled latency:  The recovery mechanism must succeed or give up
      within the time boundary imposed by the recovery process of the
      Upper Layer Protocols.

   Optional congestion control:  The aggregation of multiple concurrent
      flows may lead to the saturation of the radio network and
      congestion collapse.

      The recovery mechanism should provide means for controlling the
      number of fragments in transit over the LLN.

Appendix C.  Considerations on Congestion Control

   Considering that a multi-hop LLN can be a very sensitive environment
   due to the limited queuing capabilities of a large population of its
   nodes, this draft recommends a simple and conservative approach to
   Congestion Control, based on TCP congestion avoidance.

   Congestion on the forward path is assumed in case of packet loss, and
   packet loss is assumed upon time out.  The draft allows controlling
   the number of outstanding fragments that have been transmitted but
   for which an acknowledgment was not received yet and are still
   covered by the ARQ timer.

   Congestion on the forward path can also be indicated by an Explicit
   Congestion Notification (ECN) mechanism.  Though whether and how ECN
   [RFC3168] is carried out over the LoWPAN is out of scope, this draft
   provides a way for the destination endpoint to echo an ECN indication
   back to the fragmenting endpoint in an acknowledgment message as
   represented in Figure 4 in Section 5.2.

   While the support of echoing the ECN at the reassembling endpoint is
   mandatory, this specification only provides a minimalistic behaviour
   on the fragmenting endpoint.  If an E flag is received the window
   SHOULD be reduced, at least by 1 and at max to 1.  Halving the window
   for each E flag received, could be a good compromise but needs
   further experimentation.  A very simple implementation may just reset
   the window to 1 so the fragments are sent and acknowledged one by

Thubert                 Expires 24 September 2020              [Page 29]
Internet-Draft               Selective RFRAG                  March 2020

   Note that any action that has been performed upon detection of
   congestion only applies for the transmission of one datagram and the
   next datagram starts with the configured Window_Size again.

   The exact use of the Acknowledgement Request flag and of the window
   are left to implementation.  An optimistic implementation could send
   all the fragments up to Window_Size, setting the Acknowledgement
   Request 'X' flag only on the last fragment, wait for the bitmap,
   which means a gap of half a round-trip time, and resend the losses.
   A pessimistic implementation could set the 'X' flag on the first
   fragment to check that the path works and open the window only upon
   receiving the RFRAG_ACK.  It could then set an 'X' flag again on the
   second fragment and then use the window as a credit to send up to
   Window_Size before it is blocked.  In that case, if the RFRAG_ACK
   comes back before the window starves, the gating factor is the inter-
   frame gap.  If the RFRAG_ACK does not arrive in time, the Window_Size
   is the gating factor and the transmission of the datagram is delayed.

   It must be noted that though the inter-frame gap can be used as a
   flow control or a congestion control measure, it also plays a
   critical role in wireless collision avoidance.  In particular, when a
   mesh operates on the same channel over multiple hops, then the
   forwarding of a fragment over a certain hop may collide with the
   forwarding of the next fragment that is following over a previous hop
   but in the same interference domain.  To prevent this, the
   fragmenting endpoint is required to pace individual fragments within
   a transmit window with an inter-frame gap.  This is needed to ensure
   that a given fragment is sent only when the previous fragment has had
   a chance to progress beyond the interference domain of this hop.  In
   the case of 6TiSCH [I-D.ietf-6tisch-architecture], which operates
   over the TimeSlotted Channel Hopping [RFC7554] (TSCH) mode of
   operation of IEEE802.14.5, a fragment is forwarded over a different
   channel at a different time and it makes full sense to transmit the
   next fragment as soon as the previous fragment has had its chance to
   be forwarded at the next hop.

   Depending on the setting of the Window_Size and the inter-frame gap,
   on how the window is used and on the number of hops, the Window_Size
   may or may not become the gating factor that blocks the transmission.
   If the sender uses the Window_Size as a credit:

   *  a conservative Window_Size of, say, 3, will be the gating factor
      that limits the transmission rate of the sender - and causes
      transmission gaps longer than inter-frame gap - as soon as the
      number of hops exceeds 3 in a TSCH network and 5-9 in a single
      frequency mesh.  The more hops the more the starving window will
      add to latency of the transmission.

Thubert                 Expires 24 September 2020              [Page 30]
Internet-Draft               Selective RFRAG                  March 2020

   *  The recommendation to align the Window-Size to the round-trip time
      divided by the time per fragment aligns the Window-Size to the
      time it takes to get the RFAG_ACK before the window starves.  A
      Window-Size that is higher than that increases the chances of a
      congestion but does not improve the forward throughput.
      Considering that the RFRAG_ACK takes the same path as the fragment
      and with the assumption that it travels at roughly the same speed,
      an inter-frame gap that separates fragments by 2 hops leads to a
      Window_Size that is roughly the number of hops.

   *  Setting the Window-Size to 32 minimizes the cost of the
      Acknowledgement in a constrained network, and frees bandwidth for
      the fragments in a half-duplex network.  Using it increases the
      risk of congestion if a bottleneck forms, but optimizes the use of
      resources under normal conditions.  When it is used, the only
      protection for the network is the inter-frame gap, which must be
      chosen wisely to prevent the formation of a bottleneck.

   From the standpoint of a source 6LoWPAN endpoint, an outstanding
   fragment is a fragment that was sent but for which no explicit
   acknowledgment was received yet.  This means that the fragment might
   be on the path, received but not yet acknowledged, or the
   acknowledgment might be on the path back.  It is also possible that
   either the fragment or the acknowledgment was lost on the way.

   From the fragmenting endpoint standpoint, all outstanding fragments
   might still be in the network and contribute to its congestion.
   There is an assumption, though, that after a certain amount of time,
   a frame is either received or lost, so it is not causing congestion
   anymore.  This amount of time can be estimated based on the round-
   trip time between the 6LoWPAN endpoints.  For the lack of a more
   adapted technique, the method detailed in "Computing TCP's
   Retransmission Timer" [RFC6298] may be used for that computation.

   This specification provides the necessary tools for the fragmenting
   endpoint to take congestion control actions and protect the network,
   but leaves the implementation free to select the action to be taken.
   The intention is to use it to build experience and specify more
   precisely the congestion control actions in one or more future
   specifications.  "Congestion Control Principles" [RFC2914] and
   "Specifying New Congestion Control Algorithms" [RFC5033] provide
   indications and wisdom that should help through this process.

   [RFC7567] and [RFC5681] provide deeper information on why Congestion
   Control is needed and how TCP handles it.  Basically, the goal here
   is to manage the number of fragments present in the network; this is
   achieved by to reducing the number of outstanding fragments over a
   congested path by throttling the sources.

Thubert                 Expires 24 September 2020              [Page 31]
Internet-Draft               Selective RFRAG                  March 2020

Author's Address

   Pascal Thubert (editor)
   Cisco Systems, Inc
   Building D
   45 Allee des Ormes - BP1200
   06254 MOUGINS - Sophia Antipolis

   Phone: +33 497 23 26 34

Thubert                 Expires 24 September 2020              [Page 32]