Shepherd writeup
draft-clausen-lln-rpl-experiences

ISE write-up for: draft-clausen-lln-rpl-experiences-11

Abstract:
  "With RPL - the "IPv6 Routing Protocol for Low-power Lossy Networks" -
   published as a Proposed Standard after a ~2-year development cycle,
   this document presents an observation of the resulting protocol, of
   its applicability, and of its limits.  The documents presents a
   selection of observations on the protocol characteristics, exposes
   experiences acquired when producing various prototype implementations
   of RPL, and presents results obtained from testing this protocol - by
   way of network simulations, in network testbeds and in deployments.
   The document aims at providing a better understanding of possible
   limits of RPL, notably the possible directions that further protocol
   developments should explore, in order to address these."

An early version of this draft was developed in the ROLL WG back in
about 2014.  Michael Richardson (ROLL co-chair at the time) commented
that there was no support for the draft because it gave "insufficient
detail about parameters, so other researchers couldn't validate its claims."
Two reviews from ROLL WG members (at that time) are appended below.

The draft's authors have responded to those reviews, and submitted
a new (-10) version of the draft to the ISE on 24 Jan 2019.

It was reviewed for me by Ines Robles and Stephen Farrell; their reviews
are appended immediately below.  The authors have worked through the
issues raised by these reviewers.

I have asked the current ROLL co-chairs (Michael Richardson and
Peter van der Stok) whether the ROLL WG has any objection to this
draft being published in the Independent Stream; they say it's
"OK if the ISE feels it provides some useful content".

Since the draft describes experience with using RPL, I (Nevil) believe it
does indeed provide some useful content.

This draft has no requests for IANA.

- - - - - - -

Ines Robles' review:

I have been selected as a reviewer for this draft.

Document: draft-clausen-lln-rpl-experiences-10
Reviewer: Ines Robles
Review Date: 03-03-2018
Intended status: Informational

Summary:
I have some minor concerns about this document that I think should be resolved
before publication.

Comments:

I believe the draft is technically good. This document is well written and
clear to understand.

Major Issues:

No major issues found.

Minor Issues:

The sections of the document that refer to the charter are out-of-date, for
example:

- In Section 1:

   - The objective of ... ROLL => I would add "at the moment of writing this
   document is..."

   - "comprise up to thousands of nodes" -> it is not present in the current
   charter. If you want to use this anyway, please mention explicitly in the
   text that it refers to the charter of 2012, to avoid confusion with the
   current charter.

   - The same for the paragraph that begins with
   [roll-charter] states that "Typical traffic patterns are not simply
   unicast flows.... => it is not in the current charter.

- Section 3:

  "Downward routes" are enabled by having sensors issue => instead of sensors,
  I would add nodes, what do you think?

- Section 6:

 - You mention RFC 2460; it was obsoleted by RFC 8200

Some Nits:

- In Section 3:

   - In an LLN => In a LLN
   - characteristics etc. routers in a... => characteristics, etc. Routers in
   a...

- In section 4.1:
  - Aaggregation => Aggregation

  - In section 5.1:
    - DADAGs => DODAGs

Thanks for this document,

Best Regards,

Ines.

- - - - - - -

Stephen Farrell's review:

 I've no idea if the roll wg would be happy with this
being an ISE RFC or not. If not, then I think going with
the WG opinion, and not publishing, would be correct.
I didn't see any evidence of this being discussed in a
very quick check of the WG mailing list archive.

- I think it would be good if IETF WGs did process things
like this, that critique previous work, but it'd be up to
them.

- My other comments below assume that the WG have explicitly
indicated that they're ok with it being done via the ISE.
(I think it'd be a very good thing to try find that out
before deciding to publish or not.)

- I had a quick look at the diff from -00 to -10,
which showed less difference than I expected. I'm not
sure if that indicates that some of the work on which
this is based is OBE or not. (-00 being from 2012.)
Many of the refs do look somewhat old as well with
the most recent seeming to be from 2014. I think the
authors need to say if all these observations are
still accurate given the time elapsed. If they can't
be convincing about that, then I don't see value in
this being published as an RFC of any kind. If they
can be convincing about that, then doing that in
the document would be a good plan.

- It's just sad that they didn't even think about the
even more sad state of RPL security - I expected at least
an observation that pretty much nothing much has been done
in that space, other than punting to receding future.
I think at least ack'ing that sad state ought be done
before publishing. I'm surprised that nobody seems to
have considered ways to attack RPL when doing work on
the various demonstrators mentioned (but I didn't check
the refs to see if that was done or not).

- The RPL overview is good - I wish 6550 had been that
terse!

- The observations seem reasonable to me, but I've no
idea if they're fair or not, as I've not been involved
in using RPL. That's a reason to suggest that the WG
liking this being sent to the ISE is worth checking,
before you decide to publish or not.

- If the observations are likely considered unfair
by the WG, or significant participants therein, then
that really ought be stated in the document.

- In either case, (fair or not) I think the draft
should be explicit about why specifically it's being
sent to the ISE. IMO "no interest" isn't in itself
a good reason, and I do think ISE RFCs ought say why
they exist in general. (But you/Adrian may well not
agree with that, and that's fine:-)

- last para of section 3.1: should "below" be "above"
in 'At the time t, if c is below some "redundancy
threshold", then it transmits its DIO' - given 'c' is
described as being incremented, I don't understand it
as-is.

- Acks: Ralph is no longer with Cisco afaik.

I'm fine with these comments being public, in whatever
manner you prefer.

Cheers,
S.

= = = = = = =

Earlier reviews (from ROLL WG emails) ...

From Jari Arkko:

This has taken a long time, but I have reviewed the first part of the document
now, until the beginning of Section 8. You should get the second part soon.
Here are my comments:

Note that this is early feedback. You will get more, including an overall
evaluation. For now, I think you are raising the correct issues, but the
description in the document leaves at times something to be desired, perhaps
the main thing being that you focus only on problems. The message would be
stronger with some statement about areas where there are no scaling problems,
for instance.

Abstract: s/weaknesses and limits/limits/ (I would just say limits; it is less
controversial language; apply throughout)

A general style issue is that the document says "X does not work", whereas I
think it would be more constructive to say "Y works, but X does not". I'd scan
the document for these cases and reformulate the text a bit.

As an example,

With approximately one year past since publication of RPL as
[RFC6550], it is opportune to document
observations of the protocol, in order to understand which aspects of it
necessitate further investigations, and in order to identify possibly weak
points which may restrict the deployment scope of the protocol.

I'd reword this in the following way (or words to that effect):

   With approximately one year past since publication of RPL as
   [RFC6550], it is opportune to document observations of the protocol,
   in order to understand which aspects of it work well and which necessitate
   further investigations. Understanding possible limitations is important to
   identify issues which may restrict the deployment scope of the protocol and
   which may need further protocol work or enhancements.

Thus, although in principle RPL provides, by way of "Floating
DODAGs", protocol mechanisms for establishing a DODAG for providing
internal connectivity even in case of failure of the administratively
provisioned DODAG Root - especially in non-storing mode - it is
unlikely that any RPL Routers not explicitly provisioned as DODAG
Roots will have sufficient resources to undertake this task.

I'd reformulate this as "to support floating DODAGs, all (or at least a
large number) of the RPL routers need to have resources to act as roots".

Another possible LLN scenario is that only internal point-to-point
connectivity is sought, and no RPL Router has a more "central" role
than any other - a self-organizing LLN.  Requiring special
provisioning of a specific "super-device" as DODAG Root is both
unnecessary and undesirable.

I'd take a different approach in this paragraph. I don't think we should label
something as "unnecessary and undesirable". We should instead either document
the associated overhead (or get rid of the paragraph).

    o  There are scenarios, where all traffic is bi-directional, e.g., in
       case sensor devices in the LLN are, in majority, "actively read":
       a request is issued by the DODAG Root to a specific sensor, and
       the sensor value is expected returned.  In fact, unless all
       traffic in the LLN is unidirectional, without acknowledgements
       (e.g., as in UDP), and no control messages (e.g., for service
       discovery) or other data packets are sent from the DODAG Root to
       the RPL Routers, traffic will be bi-directional.  As an example,
       the ZigBee Alliance SEP 2.0 specification [SEP2.0]
       describes the use of HTTP over TCP over ZigBeeIP, between RPL Routers
       and the DODAG Root - and with the use of TCP inherently causing
       bidirectional traffic by way of data-packets and their corresponding
       acknowledgements.

This is good stuff. Could you add CoAP into this mix, and say that even a
current IETF protocol dedicated to the purpose typically uses acknowledgements,
to control packet loss and ensure that the message got through?

In fact, you could probably sharpen the argument a bit. Current Internet
protocols generally require some form of acknowledgment, and foregoing an
acknowledgment probably means a trade-off in the area of reliable transmission
or repeated retransmissions or both.

   For the latter, as there is no provision for on-demand generation of
    routing information from the DODAG Root to a proper subset of all RPL
    Routers, each RPL Router (besides the Root) is required to generate
    DAOs.  In particular in non-storing mode, each RPL Router will
    unicast a DAO to the DODAG Root (whereas in storing mode, the DAOs
    propagate upwards towards the Root).  The effects of the requirement
    to establish downward routes to all RPL Routers are:

    o  Increased memory and processing requirements at the DODAG Root (in
       particular in non-storing mode) and in RPL Routers near the DODAG
       Root (in storing mode).

    o  A considerable control traffic overhead [bidir], in particular at
       and near the DODAG Root, therefore:

    o  Potentially congested channels, and:

    o  Energy drain from the RPL Routers.

Could you formulate this in "works well - does not work well" -style?

    In RPL, DIO messages consist of a mandatory base object, facilitating
    DODAG formation, and additional options for e.g., autoconfiguration
    and network management.  The base object contains two unused octets,
    reserved for future use, resulting in two bytes of unnecessary zeros,
    sent with each DIO message.  The Prefix Information option, used for
    automatic configuration of address, carries even four unused octets
    in order to be compatible with IPv6 neighbor discovery.

While true, I think bringing this up may dilute the power of your main
arguments later in the same Section. I'd probably remove, or at least condense
or use more matter-of-fact style.

RPL may further increase the probability of link-layer fragmentation
of data traffic: for non-storing mode, RPL employs source-routing for
all downward traffic.  [RFC6554] specifies the RPL Source Routing header,
which imposes a fixed overhead of 8 octets per IP packet leaving 71 octets
remaining from the link-layer MTU in order to contain the whole IP packet into
a single frame - from which must be deducted a variable number of octets,
depending on the length of the route.  With fewer octets available for data
payload, RPL thus increases the probability for link-layer fragmentation of
also data packets.  This, in particular, for longer routes, e.g., for point-to-
point data traffic between sensors inside the LLN, where data traffic transit
through the DODAG Root and is then source-routed to the destination.

You could make the argument sharper here again. Could you calculate a typical
overhead for some network N-routers wide?
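
A rough version of the calculation asked for here, as a sketch only. It takes
the 71 octets that remain after the 8-octet fixed header from the quoted text,
and assumes (hypothetically) that every address carried in the source route
costs the same number of octets: 16 for a full IPv6 address, or fewer when the
address compression allowed by RFC 6554 applies. The function name and the
per-address sizes are illustrative and do not come from the draft.

```python
# Octets left in a single frame after the 8-octet fixed RPL Source Routing
# header, as stated in the quoted text.
REMAINING_AFTER_FIXED_HEADER = 71

def payload_room(addresses: int, octets_per_address: int) -> int:
    """Octets left for the rest of the IP packet.

    A negative result means the packet cannot fit in one frame at all.
    octets_per_address is an assumption: 16 models uncompressed addresses,
    smaller values model compressed ones.
    """
    return REMAINING_AFTER_FIXED_HEADER - addresses * octets_per_address

for n in (1, 2, 4, 8):
    print(n, payload_room(n, 16), payload_room(n, 2))
```

Under these assumptions, four uncompressed addresses leave only a handful of
octets, while compressed addresses stretch the usable route length
considerably.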

Given the minimal packet size of LLNs, the routing protocol must
impose low (or no) overhead on data packets,

... should ideally ... (apply elsewhere in the doc too)

consisting of thousands of routers [roll-charter],
the storing capacity on these RPL Routers may not be sufficient - or, at least,
the storage requirements in RPL Routers "near the DODAG Root" and "far from the
DODAG Root" is not homogenous, thus some sort of administrative deployment, and
continued administrative maintenance of devices, as the network evolves, is
needed.

It would be helpful to state the requirements, so that the reader would
understand what kind of demands are placed, rather than just say "not
sufficient".

Indeed,
[rpl-eval-UCB]
argues that practical experiences suggest that RPL in storing mode, with RPL
Routers having 10kB of RAM, should be limited to networks of less than ~30 RPL
Routers.

I read through the reference (quickly), but I didn't actually take the claim at
the face value. I think you need to open it up a bit. Why? Is that an
implementation problem or a fundamental limitation. For a quick
back-of-the-envelope calculation, a 16 bit CPU should be able to construct a
linked list in 2 bytes, with each entry pointing to an address (16 bytes) and
prefix (9 bytes), i.e., only 27 bytes per entry. So why does the reference
see just 30 RPL routers?
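
The back-of-the-envelope figure can be worked through as below. This is a
sketch using the sizes given in the comment above (2 + 16 + 9 bytes per entry)
and the unrealistic assumption that all 10 kB of RAM, taken here as 10 * 1024
bytes, is free for routing entries; code, stack and other protocol state are
ignored, so the result is an upper bound and not a prediction.

```python
POINTER_BYTES = 2    # linked-list pointer on a 16-bit CPU
ADDRESS_BYTES = 16   # IPv6 address
PREFIX_BYTES = 9     # prefix, as sized in the comment above

ENTRY_BYTES = POINTER_BYTES + ADDRESS_BYTES + PREFIX_BYTES  # 27 bytes

def entries_that_fit(ram_bytes: int) -> int:
    """Upper bound on routing entries if all of ram_bytes held entries."""
    return ram_bytes // ENTRY_BYTES

print(ENTRY_BYTES, entries_that_fit(10 * 1024))
```

Even this generous bound is more than ten times the ~30 routers reported,
which is why the origin of that limit deserves an explanation.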

This is the second batch of my review.

Jari

    In the Internet, no single router stores explicit routing entries for
    all destinations.  Rather, IP addresses are assigned hierarchically,
    such that an IP address does not only uniquely identify a network
    interface, but also its topological location in the network, as
    illustrated in Figure 2.  All addresses with the same prefix are
    reachable by way of the same router - which can, therefore, advertise
    only that prefix.  Other routers need only record a single routing
    entry for that prefix, knowing that as the IP packet reaches the
    router advertising that prefix, more precise routing information is
    available.

This seems to be conflating two issues. I think you can categorically say that
in the internet, you do store routing entries for all destinations. But I don't
think you can say that hierarchical addressing is used in routing. In fact, you
could argue that while hierarchies are used in addressing, a lot of the routing
runs on de-aggregated entries, for better or worse. (But I'm not the routing
guy, so maybe I'm mistaken.)

And I wish Section 8.1 would talk about the implications of running the 6lowpan
layer that uses the same /64 for the entire network. What does that imply? Do
the issues that you list only relate to the hierarchical model?

    [RFC6550] discusses some mechanisms which can (if deemed needed) be
    used to verify that a link is bidirectional before choosing an RPL
    Router as a parent - but does not specify nor recommend one of these
    for use.

But it does require a mechanism to be used, as is described in the below text
from the RFC. I think the above text is not quite as clear as it could be. Maybe
"does not specify which method is to be used but does require one to be in use"?

   RPL expects an external mechanism to be triggered
   during the parent selection phase in order to verify link properties
   and neighbor reachability.  Neighbor Unreachability Detection (NUD)
   is such a mechanism, but alternates are possible

    NUD is based upon observing if a data packet is making forward
    progress towards the destination, either by way of indicators from
    upper-layer protocols (such as TCP and, though not called out in
    [RFC4861], also from lower-layer protocols
    such as Link Layer ACKs ) or - failing that - by unicast probing by way of
    transmitting a unicast Neighbor Solicitation message and expecting that a
    solicited Neighbor Advertisement message be returned.

Actually, I think outlawing lower-layer protocols for NUD usage was an explicit
goal for the designers of ND. The idea is that you can't be sure you have an IP
entity up and running at the other end, unless you have exchanged IP packets.

- - - - - - -

From Fred Baker:

I’m reviewing the paper, trying to offer suggestions. Some of them are stream
of consciousness questions - “you said that and here’s what went through my
mind”. Many of these are of the general form “I might have said that another
way”, targeting clarity of language. Take them or leave them as you see clarity
improved.

BTW: may I say something comparable to “the king isn’t wearing anything”? When
we concoct a term like "Low-power-and-lossy-networks”, which is a plural term,
one would expect it to include plural kinds of networks. In point of fact, I
don’t think it does. It includes 802.15.4 that is not 802.15.4g (which permits
2K byte messages), and one might expect it to include Homeplug (IEEE 1911 if
memory serves, and a certain ITU category derived from that) and PLC
networks. As far as I know, it includes Zigbee SEP 2.0 networks based on
802.15.4. At some point, it seems like it might be worth describing a plurality
of networks, or saying we are targeting exactly one.

> Clausen, et al.          Expires April 30, 2015                 [Page 1]
> Internet-Draft             Observations of RPL              October 2014

Dumb question: are you observing RPL, or making observations about it and its
development? I think you mean “Observations ON RPL”, not “Observations OF RPL”.

>   [roll-charter] states that "Typical traffic patterns are not simply
>   unicast flows (e.g. in some cases most if not all traffic can be
>   point to multipoint)", and [RFC7102] further categorizes the

run-on sentence. Remove ', and' and replace it with a period and the start of a
new sentence.

Having found a run-on sentence so early, and noting that ', and' shows up 55
times in the document, I'd suggest you search for the construct and ask
yourself whether the uses could be usefully separated into two sentences. There
are many that simply identify the final item in a list, which is the proper
usage. I'll bet there are many that are in fact run-on sentences.

>   supported traffic types into "upward" traffic from sensors to a
>   collection sink or LBR (LLN Border Router) (denoted multipoint-to-
>   point), "downward" traffic from the collection sink or LBR to the
>   sensors (denoted point-to-multipoint) and traffic from "sensor to
>   sensor" (denoted point-to-point traffic), and establishes this

Could it be "sensor to actuator" or "sensor to decision server"?

>   terminology for these traffic types.  Thus, while the target for RPL
>   and ROLL is to support all of these traffic types, the emphasis among
>   these, according to [roll-charter], appears to be to optimize for
>   multipoint-to-point traffic, while also supporting point-to-
>   multipoint and point-to-point traffic.

>   The observations made in this document, except for when explicitly

"except when”

>   noted otherwise, do not depend on any specific implementation or
>   deployment, but can be understood from simply analyzing the protocol

s/from simply/by/

>   specification [RFC6550].  That said, all observations made have been
>   confirmed to also be present in, at least, some deployments or test
>   platforms with RPL, i.e., have been experimentally confirmed.

> 3.  RPL Overview
>
>   The basic construct in RPL is a "Destination Oriented Directed
>   Acyclic Graph" (DODAG), depicted in Figure 1, with a single router
>   acting as DODAG Root.  The DODAG Root has responsabilities in
>   addition to those of other routers, including for initiating,
>   configuring, and managing the DODAG, and (in some cases) acting as a
>   central relay for traffic through and between routers in the LLN.
>
>                                  (s)
>                                 ^ ^ ^
>                                /  |  \
>                              (a)  |   (b)
>                              ^   (c)    ^
>                             /     ^     (d)
>                            (f)    |    ^  ^
>                                  (e)--/    \
>                                             (g)
>
>                            Figure 1: RPL DODAG
>
>   In an LLN, in which RPL has converged to a stable state, each router
>   has identified a stable set of parents, each of which is a potential
>   next-hop on a route towards the DODAG Root.  One of the parents is
>   selected as preferred parent.  Each router, which is part of a DODAG
>   (i.e., which has selected parents and a preferred parent) will emit

This should be either "Each router that is part of a DODAG () will emit..." or
"Each router, [which is part of a DODAG ()], will emit...". The comma is a way of
setting "which is part of a DODAG ()" off as a parenthetical explanatory
remark, as the portion in parentheses is. I think you mean "Each router that is
part of a DODAG () will emit…"

>   DODAG Information Object (DIO) messages, using link-local multicast,
>   indicating its respective rank in the DODAG (i.e., distance to the
>   DODAG Root according to some metric(s), in the simplest form hop-
>   count).  Upon having received a (number of such) DIO messages, a
>   router will calculate its own rank such that it is greater than the
>   rank of each of its parents, select a preferred parent and then
>   itself start emitting DIO messages.

Does the router recalculate that each time it receives such a message, or only
once or a few times in its lifetime? It sounds like it calculates it once, and
the time it calculates it is fuzzy. I’ll bet it calculates it each time it
receives such a message.
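
To make the question concrete, here is a minimal sketch of the rank selection
the quoted paragraph describes, assuming the simplest metric it mentions (hop
count). The function name, the neighbor table and the fixed step of 1 are
illustrative assumptions; this is not what RFC 6550 mandates, only one way to
satisfy "greater than the rank of each of its parents".

```python
def choose_rank(neighbor_ranks: dict[str, int], step: int = 1):
    """Return (preferred_parent, own_rank, parent_set).

    neighbor_ranks maps a neighbor's name to the rank in its latest DIO.
    The preferred parent is the neighbor with the lowest rank; the parent
    set keeps only neighbors whose rank is strictly below our own.
    """
    preferred = min(neighbor_ranks, key=neighbor_ranks.get)
    own_rank = neighbor_ranks[preferred] + step
    parents = {n for n, r in neighbor_ranks.items() if r < own_rank}
    return preferred, own_rank, parents

print(choose_rank({"a": 1, "c": 1, "d": 2}))
```

Written as a pure function of the neighbor table, the calculation can be
repeated whenever a DIO changes that table, which is the behaviour the text
should state one way or the other.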

>   DODAG formation thus starts at the DODAG Root (initially, the only
>   router which is part of a DODAG), and spreads gradually to cover the
>   whole LLN as DIOs are received, parents and preferred parents are
>   selected, and further routers participate in the DODAG.  The DODAG
>   Root also includes, in DIO messages, a DODAG Configuration Object,
>   describing common configuration attributes for all routers in that
>   network - including their mode of operation, timer characteristics
>   etc. routers in a DODAG include a verbatim copy of the last received
>   DODAG Configuration Object in their DIO messages, permitting also
>   such configuration parameters propagating through the network.

In the course of time, I could imagine a router deciding it is part of one
DODAG, and then receiving DIOs that suggest it is or would be part of another.
In such a case, what is the rule? Does it stay in the first DODAG? Does it
change? Does it consider itself a member of both? How does it decide what to do?

>   As a Distance Vector protocol, RPL restricts the ability for a router

s/for/of/. It is an ability of the router’s.

>   to change rank.  A router can freely assume a smaller rank than
>   previously advertised (i.e., logically move closer to the DODAG Root)
>   if it discovers a parent advertising a lower rank, and must then
>   disregard all previous parents of ranks higher than the router's new
>   rank.  The ability for a router to assume a greater rank (i.e.,

s/for/of/

>   logically move farther from the DODAG Root) than previously
>   advertised is restricted in order to avoid count-to-infinity
>   problems.  The DODAG Root can trigger "global recalculation" of the
>   DODAG by increasing a sequence number, DODAG version, in DIO
>   messages.
>
>   The DODAG so constructed is used for installing routes: the
>   "preferred parent" of a router can serve as a default route towards
>   the DODAG Root, and the DODAG Root can embed in its DIO messages the
>   destination prefixes, included by DIOs generated by routers through
>   the LLN, to which connectivity is provided by the DODAG Root.  Thus,
>   RPL by way of DIO generation provides "upward routes" or "multipoint-
>   to-point routes" from the sensors inside the LLN and towards the
>   DODAG Root (and, possibly, to destinations reachable through the
>   DODAG Root).

This may be part of the answer to my question a couple of paragraphs earlier.
Are there other cases?

>   "Downward routes" are enabled by having sensors issue Destination
>   Advertisement Object (DAO) messages, propagating as unicast via
>   preferred parents towards the DODAG Root.  These describe which
>   prefixes belong to, and can be reached via, which router.  In a
>   network, all routers must operate in either of storing mode or non-
>   storing mode, specified by way of a "Mode of Operation" (MOP) flag in
>   the DODAG Configuration Object from the DODAG Root.  Those two modes
>   are non-interoperable, i.e., a mixture of routers running in
>   different modes is impossible in the same routing domain.  Depending
>   on the MOP, DAO messages are forwarded differently towards the DODAG
>   Root:
>
>   o  In "non-storing mode", a router originates a DAO messages,

“a” is singular, “messages” is plural. I think you mean “a router originates
DAO messages”.

>      advertising one or more of its parents, and unicasts these to the

“and unicasts” is probably correct usage, but I think I might write it
“unicasting”.

>      DODAG Root.  Once the DODAG Root has received DAOs from a router,
>      and from all routers on the route between it and the DODAG Root,
>      it can use source routing for reaching advertised destinations
>      inside the LLN.

Would this be more simply stated “Once the DODAG Root has received DAOs from
each router along a path to another given router, it can use source routing to
reach advertised destinations within the LLN”.

>   o  In "storing mode", each router on the route between the originator
>      of a DAO and the DODAG Root records a route to the prefixes
>      advertised in the DAO, as well as the next-hop towards these (the
>      router, from which the DAO was received), then forwards the DAO to
>      its preferred parent.

Does it really store the path AND the next hop? The next hop would be the first
hop in the path if it stores the path. If everyone is storing, you don’t need to
store the path, only the next hop.

>   "Point-to-point routes", for communication between devices inside the
>   LLN and where neither of the communicating devices are the DODAG
>   Root, are as default supported by having the source sensor transmit a

"by default”, or “supported (by default) ..."

>   data packet, via its default route to the DODAG Root (i.e., using the
>   upward routes), which will then, depending on the "Mode of Operation”
>   for the DODAG, either add a source-route to the received data packet
>   for reaching the destination sensor (downward routes in non-storing
>   mode), or simply use hop-by-hop routing (downward routes in storing
>   mode) for forwarding the data packet.

That is a l-o-n-g s-e-n-t-e-n-c-e. I think I would break it up:

“Point-to-point routes” are used to communicate between non-root nodes in the
DODAG. These may use hop-by-hop routing, as is done in general routed networks,
or go via the DODAG Root. In the latter case, the packet follows a default
route to the DODAG Root, which attaches a source route to it. The source route
is then used to deliver the packet.

To me, that’s more understandable.

> In the case of storing mode,
>   if the source and the destination for a point-to-point data packet
>   share a common ancestor other than the DODAG Root, a downward route
>   may be available in a router (and, thus, used) before the data packet
>   reaches the DODAG Root.

In storing mode, it is following next hops. In that case, they would point
downward. Is this paragraph supposed to surprise me? If so, I missed something.

> 3.1.  RPL Message Emission Timing - Trickle Timers
>
>   RPL message generation is timer-based, with the DODAG Root being able
>   to configure back-off of message emission intervals using Trickle
>   [RFC6206].  Trickle, as used in RPL, stipulates that a router
>   transmits a DIO "every so often" - except if receiving a number of
>   DIOs from neighbor routers, enabling the router to determine if its

should this comma be a dash?

>   DIO transmission is redundant.
>
>   When a router transmits a DIO, there are two possible outcomes:
>   either every neighbor router that hears the message finds that the
>   information contained is consistent with its own state (i.e., the
>   received DODAG version number corresponds with that which the router
>   has recorded, and no better rank is advertised than that which is
>   recorded in the parent set) - or, a recipient router detects that
>   either the sender of the DIO or itself has out-of-date information.

Is there not a third possible case, that it fails to receive the message?

>   If the sender has out-of-date information, then the recipient router
>   schedules transmission of a DIO to update this information.  If the
>   recipient router has out-of-date information, then it updates based
>   on the information received in the DIO.
...
> 4.  Requirement Of DODAG Root

...

>   When operating in non-storing mode, this entails that the DODAG Root

Do you mean “entails” or “implies”? I suspect the latter.

>   is required to have sufficient memory and sufficient computational
>   resources to be able to record a network graph containing all routes
>   from itself and to all destinations and to calculate routes.
>
>   When operating in storing mode, this entails that the DODAG Root

implies?

>   needs enough memory to keep a list of all routers in the RPL
>   instance, and a next hop for each of those routers.  If aggregation
>   is used, the memory requirements can be reduced in storing mode (see
>   Section 8 for observations about aggregation in RPL).

...

> 4.1.  Observations

...

>   A router provisioned with resources to act as a DODAG Root, and
>   administratively configured to act as such, represents a single point
>   of failure for the DODAG it serves.

Why qualify it so much? Suppose it doesn’t have the resources it would need,
but is configured to act as a DODAG Root anyway. Would it not be a SPOF anyway?
I should think that a DODAG Root is a SPOF regardless, and a next hop is a SPOF
for the routes for which it is the next hop.

...

> 5.  RPL Data Traffic Flows

A general remark here. It really sounds as though the application
(communications among decision authorities, sensors, and actuators) is embedded
in the routing protocol, and along with it assumptions about the common ways
the application communicates. Speaking strictly for myself, that seems an odd
coupling (see RFC 3439 for issues in coupling), and may limit the utility of
the protocol. Maybe that’s the point you’re making. But RPL, at least as you
describe it, doesn’t route among communicating nodes per se; it routes between
DODAG Roots and nodes within a DODAG, and hopes that the structure of the
application superimposes the expected command/telemetry model on the DODAG
structure.

>   While not specifically called out thus in [RFC6550], the resulting
>   protocol design, however, reflects these assumptions in that the

leave “however” out. The “While” construct already said that.

>   mechanism constructing multipoint-to-point routes is efficient in
>   terms of control traffic generated and state required, point-to-
>   multipoint route construction much less so - and point-to-point
>   routes subject to potentially significant route stretch (routes going
>   through the DODAG Root in non-storing mode) and over-the-wire
>   overhead from using source routing (from the DODAG Root to the
>   destination) (see Section 7) - or, in case of storing mode,

- or, in the storing mode case, ...

>   considerable memory requirements in all LLN routers inside the
>   network (see Section 7).

>   A router which wishes to act as a destination for data traffic

a router that...

>   ("downward routes" or "point-to-multipoint") issues DAOs upwards in
>   the DODAG towards the DODAG Root, describing which prefixes belong
>   to, and can be reached via, that router.
>
>   Point-to-Point routes between routers below the DODAG Root are
>   supported by having the source router transmit, via its default
>   route, data traffic towards the DODAG Root.  In non-storing mode, the
>   data traffic will reach the DODAG Root, which will reflect the data
>   traffic downward towards the destination router, adding a strict
>   source routing header indicating the precise route for the data
>   traffic to reach the intended destination router.  In storing mode,
>   the source and the destination may possibly (although, may also not)
>   have a common ancestor other than the DODAG Root, which may provide a
>   downward route to the destination before data traffic reaching the
>   DODAG Root.

Didn’t you say that in section 3?
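
The route stretch described in the quoted paragraph can be illustrated with a small sketch. It assumes each router has exactly one preferred parent and that, in storing mode, every ancestor on the destination's path to the Root holds a downward route to it. The function names and the example topology are hypothetical and serve only to show the two turn-around points.

```python
def path_to_root(node, parent):
    """Return [node, ..., root] by following preferred-parent links."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def p2p_route(src, dst, parent, storing):
    """Sketch of the hop sequence a point-to-point packet would follow."""
    up = path_to_root(src, parent)
    down = path_to_root(dst, parent)
    if storing:
        # Turn around at the first ancestor of src that also lies on dst's
        # path to the root (possibly the root itself).
        on_dst_path = set(down)
        turn = next(n for n in up if n in on_dst_path)
    else:
        # Non-storing: always climb to the DODAG Root, which source-routes
        # the packet back down.
        turn = up[-1]
    return up[:up.index(turn)] + down[:down.index(turn) + 1][::-1]


# Hypothetical DODAG: R is the root; A and B are its children;
# C and D sit below A; E sits below B.
EXAMPLE_PARENT = {"R": None, "A": "R", "B": "R", "C": "A", "D": "A", "E": "B"}
```

With this topology, traffic from C to D turns around at A in storing mode but travels C, A, R, A, D in non-storing mode, which is the stretch the draft refers to.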

> 5.1.  Observations
>
>   RPL is well suited for networks in which the sink for data traffic is
>   co-located with, (or is outside the LLN and reachable via), the DODAG
>   root.  However, these data traffic characteristics does not represent
>   a universal distribution of traffic types in LLNs.  There are
>   scenarios where the sink is not co-located with (or is outside the
>   LLN and reachable via) the DODAG.  These include:
>
>   o  Command/control networks in which sensor-to-sensor traffic is a
>      more common occurrence, documented, e.g., in [RFC5867] ("Building
>      Automation Routing Requirements in Low Power and Lossy Networks").

Why limit to sensor/sensor? If I have a thermostat in my home, a furnace, and
a control system, they are probably all within my LLN. They are by definition a
sensor, an actuator, and a decision element, and there is no obvious reason why
any of them "should" be the DODAG Root.

>   o  Networks in which all traffic is bi-directional, e.g., in case
>      sensor devices in the LLN are, in majority, "actively read": a
>      request is issued by the DODAG Root to a specific sensor, and the
>      sensor value is expected returned.  In fact, unless all traffic in
>      the LLN is unidirectional, without acknowledgements (e.g., as in
>      UDP), and no control messages (e.g., for service discovery) or
>      other data packets are sent from the DODAG Root to the routers,
>      traffic will be bi-directional.  The IETF protocol for use in
>      constrained environments, CoAP [RFC7252], makes use of
>      acknowledgements to control packet loss and ensure that packets
>      are received by the packet destination.  In the four message types
>      defined for CoAP: confirmable, acknowledgement, reset and non-
>      confirmable, the first three are dedicated for sending/
>      acknowledgement cycle.  Another example is that the ZigBee
>      Alliance SEP 2.0 specification [SEP2.0] (adopted by the IEEE)
>      describes the use of HTTP over TCP over ZigBeeIP, between routers
>      and the DODAG Root - and with the use of TCP inherently causing
>      bidirectional traffic by way of data-packets and their
>      corresponding acknowledgements.  In fact, current Internet
>      protocols generally require some form of acknowledgment, and
>      foregoing an acknowledgment probably means a trade-off in the area
>      of reliable transmission or repeated retransmissions or both.
>
>   o  Telemetry scenarios where there the DODAG root and the sink are

s/there//

>      not co-located.  This can happen if different kinds of information
>      are sent to different central authorities for processing: for
>      example, temperature goes to Server A and humidity goes to Server
>      B. A possible solution for RPL is to run several DADAGs with
>      different roots, which incurs extra overhead.
>
>   For scenarios where sensor-to-sensor traffic is a more common

s/where/in which/

>   occurrence, all sensor-to-sensor routes include the DODAG Root,
>   possibly causing congestions on the communication medium near the

“congestion” or “congestion events”, depending on whether you mean perfect
tense (continuing) or plural disparate events

>   DODAG Root, and draining energy from the intermediate routers on an
>   unnecessarily long route.  If sensor-to-sensor traffic is common,
>   routers near the DODAG Root will be particularly solicited as relays,
>   especially in non-storing mode.
>
>   For scenarios with bi-directional traffic, as there is no provision
>   for on-demand generation of routing information from the DODAG Root
>   to a proper subset of all routers, each router (besides the Root) is
>   required to generate DAOs.  In particular in non-storing mode, each
>   router will unicast a DAO to the DODAG Root (whereas in storing mode,
>   the DAOs propagate upwards towards the Root).  The effects of the
>   requirement to establish downward routes to all routers are:
>
>   o  Increased memory and processing requirements at the DODAG Root (in
>      particular in non-storing mode) and in routers near the DODAG Root
>      (in storing mode).
>
>   o  A considerable control traffic overhead [bidir], in particular at
>      and near the DODAG Root, therefore:
>
>   o  Potentially congested channels, and:
>
>   o  Energy drain from the routers.

regarding congestion, the obvious question is whether it is real. 802.15.4,
IIRC, is on the order of single digit MBPS, and telemetry channels that use it
are generally on the order of single digit messages per second. Is congestion a
real issue? If it is, the bit rate is an issue first.

> 6.  Fragmentation Of RPL Control Messages And Data Packet
>
>   Some link layers used in LLNs, such as IEEE 802.15.4 [ieee802154],
>   are unable to provide an MTU of at least 1280 octets - as otherwise
>   required for IPv6 [RFC2460].

unless, of course, they use 802.15.4g. Can you say “premature optimization”?

> In such LLNs, link fragmentation and
>   reassembly of IP packets at a layer below IPv6 is used to transport
>   larger IP packets, providing the required minimum 1280 octet MTU
>   [RFC4919].

...

>   frame size of 127 octets, as well as compressing the IPv6 header,
>   reducing the overhead of the IPv6 header from at least 40 octets to a
>   minimum of 2 octets.  Given the IEEE 802.15.4 frame size of 127

given that the...

>   octets, a maximum frame overhead of 25 octets and 21 octets for link
>   layer security [RFC4944], 81 octets remain for L2 payload.  Further
>   subtracting 2 octets for the compressed IPv6 header leaves 79 octets
>   for L3 data payload if link fragmentation is to be avoided.
>
>   The second L in LLN indicating Lossy [roll-charter], higher loss
>   rates than typically seen in IP networks are expected, rendering link
>   fragmentation important to avoid.  This, in particular because, as
>   mentioned above, the whole IP packet is dropped if only a single
>   fragment is lost [RFC4944].

Also true in ATM AAL5 with EPD.
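
The octet budget in the quoted text can be checked with a few lines of arithmetic. The constants are the figures given in the draft excerpt above; the function names are hypothetical, and the sketch ignores any further headers (for example RPL extension headers) that would shrink the budget.

```python
FRAME_SIZE = 127            # IEEE 802.15.4 frame size, in octets
MAX_FRAME_OVERHEAD = 25     # maximum frame overhead
LINK_SECURITY = 21          # link-layer security overhead
COMPRESSED_IPV6_HEADER = 2  # minimum size of the compressed IPv6 header


def l2_payload():
    """Octets left for the L2 payload after frame and security overhead."""
    return FRAME_SIZE - MAX_FRAME_OVERHEAD - LINK_SECURITY


def l3_payload():
    """Octets left for L3 data once the compressed IPv6 header is counted."""
    return l2_payload() - COMPRESSED_IPV6_HEADER


def needs_fragmentation(l3_octets):
    """True if an L3 payload of this size cannot fit into a single frame."""
    return l3_octets > l3_payload()
```

Under these figures `l2_payload()` gives 81 and `l3_payload()` gives 79, matching the numbers in the quoted paragraph.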

> 6.1.  Observations
>
>   [RFC4919] makes the following observation

>      A ZIP node MUST ensure that the insertion of a RPL extension
>      header, either directly or via IPv6-in-IPv6 tunneling, does not
>      cause IPv6 fragmentation.  This is done by using a different MTU
>      value for packets where the IPv6 header includes a RPL extension
>      header.  The RPL tunnel entry point SHOULD be considered as a
>      separate interface whose MTU is set to the 6LoWPAN interface MTU
>      plus RPL_MTU_EXTENSION bytes.
>
>   Section 7.1 of [ZigBeeIP] defines RPL_MTU_EXTENSION to be 100 bytes.

Chewing my tongue.

> 10.  Neighbor Unreachability Detection For Unidirectional Links
>
>   [RFC6550] suggests using Neighbor Unreachability Detection (NUD)
>   [RFC4861] to detect and recover from the situation of unidirectional
>   links between a router and its (preferred) parent(s).  When, e.g., a
>   router tries (and fails) to actually use another router for
>   forwarding traffic, NUD is supposed engaged to detect and prompt
>   corrective action, e.g., by way of selecting an alternative preferred
>   parent.
>
>   NUD is based upon observing if a data packet is making forward

s/if/whether/

>   progress towards the destination, either by way of indicators from
>   upper-layer protocols (such as TCP and, though not called out in
>   [RFC4861], also from lower-layer protocols such as Link Layer ACKs )
>   or - failing that - by unicast probing by way of transmitting a
>   unicast Neighbor Solicitation message and expecting that a solicited
>   Neighbor Advertisement message be returned.

I have skipped commenting on a number of sections. It’s not that I didn’t read
them. I just didn’t have much to say.

I will comment, though, that I could begin to get the impression that RPL wasn’t
your very most favorite protocol.

============

May I commend you on a well-written document?

> 2.  Terminology
>       ...
>    Finally, this document introduces the following terminology:
>
>    RPL Router -  A device, running the RPL protocol, as specified by
>       [RFC6550].

This hits a personal hot button of mine. Truth be told, a router MIGHT run RPL,
and might run the same instance of RPL, on all of its interfaces. It might also
run one instance of RPL on one set of interfaces, another on another set of
interfaces, and OSPF, IS-IS, BGP, or simple static routing on a third set of
interfaces. Something that I see repeatedly in drafts is confusion between a
system (a bit of hardware), a (software) process running on it, and some subset
of its interfaces whether a proper subset or not.

I would really rather that you didn't use this term, at least with this
definition. I would far rather that you talked about a RPL Routing Process,
which implements RPL and serves some or all of the interfaces on a given bit of
equipment.

> 4.1

> In storing mode, the DODAG root needs to keep a routing entry for all RPL
Routers in the RPL instance.

Phraseology/terminology disconnect. When I read this, I mentally changed it to
"the DODAG root needs to keep a routing entry for ->each<- RPL Router in the
RPL instance". It doesn't keep one and presume that is information for all; it
keeps many, which are the various RIB/FIB entries for individual RPL Routers.

I might have misunderstood that.

In the larger paragraph surrounding that:

>    In a given deployment, select RPL Routers can be provisioned with the
>    required energy, memory and computational resources so as to serve as
>    DODAG Roots, and be administratively configured as such - with the
>    remainder of the RPL Routers in the network being of typically lesser
>    capacity.  In storing mode, the DODAG root needs to keep a routing
>    entry for all RPL Routers in the RPL instance.  In non-storing mode,
>    the resource requirements on the DODAG Root are likely much higher
>    than in storing mode, as the DODAG Root needs to store a network
>    graph containing complete routes to all destinations in the RPL
>    instance, in order to calculate the routing table (whereas in storing
>    mode, only the next hop for each destination in the RPL instance
>    needs to be stored, and aggregation may be used to further reduce the
>    resource requirements).

I think that in both storing mode and non-storing mode, the router needs to
keep state for each RPL router in the domain. What is different is that in
storing mode it stores the next hop, and in non-storing mode it stores a source
route. Correct?
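
The distinction being asked about can be pictured with a small sketch. It is an illustration only, not taken from the draft: it assumes a single preferred parent per router, represents the DODAG as a hypothetical child-to-parent map, and uses invented function names. In both modes the root ends up holding one item of state per router; what differs is whether that item is a next hop or a link from which a source route is assembled.

```python
def storing_root_table(parent, root):
    """Storing mode: the root keeps one next hop per destination router."""
    table = {}
    for node in parent:
        if node == root:
            continue
        hop = node
        # Walk upward until we reach the root's direct child on this branch.
        while parent[hop] != root:
            hop = parent[hop]
        table[node] = hop
    return table


def non_storing_source_route(parent, root, dst):
    """Non-storing mode: the root keeps the child -> parent links it has
    learned (one per router) and assembles a source route on demand."""
    route = [dst]
    while route[-1] != root:
        route.append(parent[route[-1]])
    return route[::-1]
```

For a DODAG in which C sits below A and A below the root R, the storing-mode table maps C to next hop A, while the non-storing root computes the route R, A, C from the stored links.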

>    RPL Routers provisioned with resources to act as DODAG Roots, and
>    administratively configured to act as such, represent a single point
>    of failure in the network.

Ignorant question: Does a DODAG always wrap up to a single root router, or can
it have multiple roots? If it can have multiple roots, you are not looking at a
single point of failure as much as a "final" point of failure - they all have
to fail to effect a failure. In any event, putting a plural (RPL Routers...) in
a sentence with the word "single" is a little jarring. I might rephrase as "An
RPL Router represents", or "may represent", "a single point of failure for each
DODAG it serves."

> 5.  RPL Data Traffic Flows
>
>    RPL makes a-priori assumptions of data traffic types, and explicitly
>    defines three such [I-D.ietf-roll-terminology] traffic types: sensor-
>    to-root data traffic (multipoint-to-point) is predominant, root-to-
>    sensor data traffic (point-to-multipoint) is rare and sensor-to-
>    sensor (point-to-point) data traffic is extremely rare.

If it were me, I would eschew the declarations about the paths of the data
streams in favor of describing their intent; if they are to be described per
se, I would use the more common terminology of "commands" and "telemetry".
Telemetry proceeds from a sensor to a sink, and commands originate from
someone/something that has knowledge and authority to a thing-being-commanded.
I realize you didn't pick the language. But in this particular taxonomy, there
is an implicit statement of the structure of both the network and the
application it supports, and while I do think the structure is a common one, I
would not expect it is the only one possible or desirable. In an 802.1d
Spanning Tree network, for example, there is similarly a root, and BPDUs
progress away from it to map the surrounding bridged domain. A controlling
system might be co-located with the root, but there is no obvious reason to
presume or require that.

This may be the point you are making in 5.1. There, in essence, you say that
RPL is well suited to networking in which telemetry travels from sensors to the
root, but is not well suited to networks in which the sink is not at or
accessed via the root.

> 6.  Fragmentation Of RPL Control Messages And Data Packet
>
>    Link layers, used in LLNs, are often unable to provide an MTU of, at
>    least, 1280 octets - as otherwise required for IPv6 [RFC2460].

An English major might correct me, but I would think the commas are not
required.

As an aside, this is not a characteristic of LLNs. It is a characteristic of
802.15.4, and specifically not a characteristic of 802.15.4g or IEEE 1901
Homeplug. I personally would be greatly gratified if the IETF would replace the
term "LLN" with "IEEE 802.15.4". We say a lot of things that are very
specifically incorrect when we commit the fallacy of generalizing the instance.

>    When such below-the-IP-layer fragmentation is used, the IP packet has
>    to be reassembled at every hop.  Every fragment must be received
>    successfully by the receiving device, or the entire IP packet is
>    lost.

IMHO, the segmentation process could be designed in a way that doesn't have
this requirement.

>    The second L in LLN indicating Lossy [roll-charter], higher loss
>    rates than typically seen in IP networks are expected, rendering
>    fragmentation important to avoid.  This, in particular because, as
>    mentioned above, the whole IP packet is dropped if only a single
>    fragment is lost.

True, but to my mind beside the point. We could use end to end IPv6
fragmentation, so that the RPL network simply passes the trash, or we could use
PMTU. Would the application messages get through more reliably? I doubt it.
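
The effect described in the quoted paragraphs (one lost fragment costs the whole IP packet, and reassembly happens at every hop) can be made concrete with a rough calculation. The sketch assumes independent fragment losses with the same probability on every hop and no link-layer retransmissions; neither assumption comes from the draft, and the function name is hypothetical.

```python
def packet_delivery_probability(fragment_loss, fragments, hops=1):
    """Probability that every fragment survives every hop.

    fragment_loss: probability that a single fragment is lost on one hop
    fragments:     number of link-layer fragments per IP packet
    hops:          number of hops, with full reassembly at each hop
    """
    per_hop = (1.0 - fragment_loss) ** fragments
    return per_hop ** hops
```

As an example, with a 5% fragment loss rate an unfragmented packet crosses one hop with probability 0.95, whereas a packet split into 10 fragments does so with probability of roughly 0.60.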

> 6.1:

At least part of your observation here is that a RPL network does not in fact
carry IPv6 traffic; IPv6 is carried inside a tunnel header that in turn is
different in important ways. You say that, but I tend to think the discussion
would benefit from a laser-focused statement here. If the objective is to carry
IPv6 traffic, it would be nice if it actually did so without modifying the IPv6
header.

I didn't have a lot of comments on 7. Maybe that means I didn't read it well
enough.

> 8.  Address Aggregation and Summarization
>  ...
>    In the Internet, no single router stores explicit routing entries for
>    all destinations.   Rather, IP addresses are assigned hierarchically,
>    such that an IP address does not only uniquely identify a network
>    interface, but also its topological location in the network

I'd be a little careful in saying that aggregation is a characteristic of the
"Internet". The Internet is composed of autonomous networks, and they each
aggregate in degrees that make sense to them. Switched Ethernet, WiFi, Manet
networks, and (apparently) RPL/LLN route individual hosts. From my perspective,
it would be *nice* to be able to inject prefixes into RPL, but the observation
that it doesn't think in those terms is a fact, but not necessarily a criticism.

> 11.1.

s/In order to accommodate for /In order to accommodate /

I stopped reading at this point; I have other fish I need to fry today.
However, if you would like me to finish a review, I can do that, perhaps next
week. However, and this is perhaps a comment to Adrian, I'm willing to be the
shepherd for the document.

- - - - - - -