Summary: Needs a YES. Has 2 DISCUSSes. Needs 3 more YES or NO OBJECTION positions to pass.
Section 3. Can the normative reference for the SAX algorithm be clarified? The text cites [SAX-DASFAA], but this is an informative reference (and an academic paper with no URL). Appendix B also appears to describe an algorithm, but the introduction describes the text as “an example implementation SAX hash function”.
** Section 4.2 of RFC8480 outlines requirements for a specification of an SF. One of those is that the SF “SHOULD clearly state the application domain the SF is created for.” Is there one for this draft?
** Section 5.1. Per “In case that a node booted or disappeared from the network, the cell reserved at the selected parent may be kept in the schedule forever. A clean-up mechanism MUST be provided to resolve this issue.”, is there an implicit DDoS being described here where rogue nodes could boot up and hold cells before the clean-up mechanism activates?
** Appendix B. Per step #8, what does this do? h should already have the final value of Step 4 assigned to it.
** Editorial Nits:
-- Section 4.4. Typo. s/Neighbor Dicovery/Neighbor Discovery/
-- Section 16. Typos. s/MSF adapts to traffics containing packets from IP layer/MSF adapts to traffic containing packets from the IP layer/
I'm concerned that the scheduling function for autonomous cells can cause an infinite loop in the case of hash collision -- Section 3 specifies that AutoTxCell always takes precedence over AutoRxCell, but if those two cells collide, the corresponding cells on the peer in question will also collide. If both peers try to send at the same time and the hashes collide, they will both attempt to transmit indefinitely and never be received.
There seems to be some "passing the buck" going on with respect to rate-limiting unauthenticated (join) traffic: draft-ietf-6tisch-minimal-security (Section 6.1.1) says that the SF "SHOULD NOT allocate additional cells as a result of traffic with code point AF43"; this document is implementing an SF, and yet we try to avoid the issue, saying that "[t]he implementation at IPv6 layer SHOULD ensure that this join traffic is rate-limited before it is passed to 6top sublayer where MSF can observe it". I think we need a clear and consistent story about where this rate-limiting is supposed to happen.
I support Roman's Discuss -- we need more information for this to be a useful reference; even what seem to be the official DASFAA 1997 proceedings (https://dblp.org/db/conf/dasfaa/dasfaa97) do not have an associated document.
Basing various scheduling aspects on (a hash of) the EUI64 ties functionality to a persistent identifier for a device. How significant a disruption would be incurred if a device periodically changes its presented EUI64 for anonymization purposes?
There seems to be a general pattern of "if you don't have a 6P-negotiated Tx cell, install an AutoTxCell to send your one message and then remove it after sending"; I wonder if it would be easier on the reader to consolidate this as a general principle and not repeat the details every time it occurs.
Requirements Language
"NOT RECOMMENDED" is not in the RFC2119 boilerplate (but is a BCP 14 keyword).
Section 1
   the 6 steps described in Section 4. The end state of the join process is that the node is synchronized to the network, has mutually authenticated to the network, has identified a routing parent, and
nit(?): I guess maybe "mutually authenticated with" is more correct for the bidirectional operation.
   It does so for 3 reasons: to match the link-layer resources to the traffic, to handle changing parent, to handle a schedule collision.
nit: end the list with "or" (or "and"?).
   MSF works closely with RPL, specifically the routing parent defined in [RFC6550]. This specification only describes how MSF works with one routing parent, which is phrased as "selected parent". The
nit: I suggest '''one routing parent; this parent is referred to as the "selected parent"'''.
   activity of MSF towards to single routing parent is called as a "MSF
nit: "towards the"
   * We added sections on the interface to the minimal 6TiSCH configuration (Section 2), the use of the SIGNAL command (Section 6), the MSF constants (Section 14), the MSF statistics (Section 15).
nit: end the list with "and".
Section 2
   In a TSCH network, time is sliced up into time slots. The time slots are grouped as one of more slotframes which repeat over time. The
nit(?): should this be "one or more"?
   channel) is indicated as a cell of TSCH schedule. MSF is one of the policies defining how to manage the TSCH schedule.
nit: if there is only one such policy active at a given time for a given network, I suggest "MSF is a policy for managing the TSCH schedule". (If multiple policies are active simultaneously, no change is needed.)
   MSF uses the minimal cell for broadcast frames such as Enhanced Beacons (EBs) [IEEE802154] and broadcast DODAG Information Objects (DIOs) [RFC6550]. Cells scheduled by MSF are meant to be used only for unicast frames.
If this paragraph was moved before the previous paragraph, then EB and DIO would be defined before their first usage.
   bandwidth of minimal cell. One of the algorithm met the rule is the Trickle timer defined in [RFC6206] which is applied on DIO messages [RFC6550]. However, any such algorithm of limiting the broadcast
nit(?): "One of the algorithms that fulfills this requirement"?
   MSF RECOMMENDS the use of 3 slotframes. MSF schedules autonomous cells at Slotframe 1 (Section 3) and 6P negotiated cells at Slotframe 2 (Section 5) , while Slotframe 0 is used for the bootstrap traffic as defined in the Minimal 6TiSCH Configuration. It is RECOMMENDED to use the same slotframe length for Slotframe 0, 1 and 2. Thus it is
Perhaps this is just a question of writing style, but if an implementation is free to use an alternative SF or a variant of MSF, could we not say that "MSF uses 3 slotframes", "MSF uses the same slotframe length for", etc.?
Section 3
Is there any risk of unwanted correlation between slot and channel offsets when using the same hash function and input for both calculations?
   hash function. Other optional parameters defined in SAX determine the performance of SAX hash function.
   Those parameters could be broadcasted in EB frame or pre-configured. For interoperability purposes, an example how the hash function is implemented is detailed in Appendix B.
Given the lack of usable reference for [SAX-DASFAA], I assume that the content in Appendix B is going to be used as a specification, not just an example.
   * The AutoRxCell MUST always remain scheduled after synchronized.
nit: s/synchronized/synchronization/
   AutoRxCell. In case of conflicting with a negotiated cell, autonomous cells take precedence over negotiated cell, which is stated in [IEEE802154]. However, when the Slotframe 0, 1 and 2 use the same length value, it is possible for negotiated cell to avoid the collision with AutoRxCell.
Presumably this factors in to the recommendation to have the three listed slotframes use the same length, but mentioning it explicitly (whether here or where the recommendation is made) might be nice.
Section 4
   network. Alternative behaviors may involved, for example, when alternative security solution is used for the network. Section 4.1
nit: singular/plural mismatch "behaviors"/"solution is used"
Section 4.1
   A node implementing MSF SHOULD implement the Minimal Security Framework for 6TiSCH [I-D.ietf-6tisch-minimal-security]. As a
Didn't this get renamed to CoJP?
Section 4.2
I wonder a little bit whether there is a better description than "available frequencies" but don't have one to offer.
Section 4.3
   While the exact behavior is implementation-specific, it is RECOMMENDED that after having received the first EB, a node keeps listen for at most MAX_EB_DELAY seconds until it has received EBs from NUM_NEIGHBOURS_TO_WAIT distinct neighbors, which is defined in [RFC8180].
nit(?): this phrasing implies that only NUM_NEIGHBOURS_TO_WAIT is defined in RFC 8180, but MAX_EB_DELAY is also defined there.
not-nit: this phrasing is ambiguous as to whether one of MAX_EB_DELAY and NUM_NEIGHBOURS_TO_WAIT is sufficient to move to the next step or whether both are required.
Section 4.4
   After selected a JP, a node generates a Join Request and installs an AutoTxCell to the JP. The Join Request is then sent by the pledge to its JP over the AutoTxCell. The AutoTxCell is removed by the pledge
editorial: I'd suggest s/its JP/its selected JP/
   Response is sent out. The pledge receives the Join Response from its AutoRxCell, thereby learns the keying material used in the network, as well as other configurations, and becomes a "joined node".
nit: maybe "other configuration values" or "other configuration settings"?
Section 4.6
   Once it has selected a routing parent, the joined node MUST generate a 6P ADD Request and install an AutoTxCell to that parent. The 6P ADD Request is sent out through the AutoTxCell with the following fields:
   * CellOptions: set to TX=1,RX=0,SHARED=0
   * NumCells: set to 1
   * CellList: at least 5 cells, chosen according to Section 8
Is this listing describing the contents of the ADD request or the AutoTxCell used to send it? (I presume the former, in which case I suggest to use "containing" or similar in preference to "with".)
Section 5.1
   The goal of MSF is to manage the communication schedule in the 6TiSCH schedule in a distributed manner. For a node, this translates into monitoring the current usage of the cells it has to the selected parent:
Is this goal strictly limited to traffic "to the selected parent" vs. all traffic?
   * If the node determines that the number of link-layer frames it is attempting to exchange with the selected parent per unit of time is larger than the capacity offered by the TSCH negotiated cells it has scheduled with it, the node issues a 6P ADD command to that parent to add cells to the TSCH schedule.
   * If the traffic is lower than the capacity, the node issues a 6P DELETE command to that parent to delete cells from the TSCH schedule.
As written, this would potentially lead to oscillation when demand is basically at capacity, due to the quantization of capacity.
Perhaps some provisioning for hysteresis is appropriate?
   The cell option of cells listed in CellList in 6P Request frame SHOULD be either Tx=1 only or Rx=1 only. Both NumCellsElapsed and NumCellsUsed counters can be used to both type of negotiated cells.
Would this be more clear as "(Tx=1,Rx=0) or (Tx=0,Rx=1)"?
   * NumCellsElapsed is incremented by exactly 1 when the current cell is AutoRxCell.
This holds for all peers/parents we're keeping counters for, so the AutoRxCell can get "double counted"?
   In case that a node booted or disappeared from the network, the cell reserved at the selected parent may be kept in the schedule forever. A clean-up mechanism MUST be provided to resolve this issue. The clean-up mechanism is implementation-specific. It could either be a periodic polling to the neighbors the nodes have negotiated cells with, or monitoring the activities on those cells. The goal is to confirm those negotiated cells are not used anymore by the associated neighbors and remove them from the schedule.
I'm not sure that "monitoring the activities on those cells" is safe with the current level of specification; if a node negotiates a 6P transmit cell to a parent and uses it only sparingly, with the parent eventually reclaiming it due to inactivity, I don't see a mechanism by which the node will reliably discover the negotiated cell to be nonfunctional and fall back to (e.g.) the corresponding AutoTxCell. It may be most prudent to just not mention that as an example (a "periodic polling" procedure does not seem to have the same potential for information skew).
Section 5.3
   schedule is executed and the node sends frames to that parent. When NumTx reaches MAX_NUMTX, both NumTx and NumTxAck MUST be divided by 2. For example, when MAX_NUMTX is set to 256, from NumTx=255 and NumTxAck=127, the counters become NumTx=128 and NumTxAck=64 if one frame is sent to the parent with an Acknowledgment received.
   This operation does not change the value of the PDR, but allows the counters to keep incrementing. The value of MAX_NUMTX is implementation-specific.
Does MAX_NUMTX need to be a power of two (to avoid errors when the division occurs)?
   4. For any other cell, it compares its PDR against that of the cell with the highest PDR. If the difference is larger than RELOCATE_PDRTHRES, it triggers the relocation of that cell using a 6P RELOCATE command.
The recommended RELOCATE_PDRTHRES is given as "50 %". Is this "difference" performed as a subtraction (so that if the highest PDR is less than 50%, no cells can ever be relocated) or a ratio (a PDR that's half the maximum PDR or smaller will trigger relocation)?
Section 7
Maybe reference Section 17.1 where the allocation will occur?
Section 8
   * The slotOffset of a cell in the CellList SHOULD be randomly and uniformly chosen among all the slotOffset values that satisfy the restrictions above.
   * The channelOffset of a cell in the CellList SHOULD be randomly and uniformly chosen in [0..numFrequencies], where numFrequencies represents the number of frequencies a node can communicate on.
Do these random selections need to be independent from each other? (I note that the selections for the autonomous cells are not.)
Section 9
Is there a reference for these three parameters (MAXBE, MAXRETRIES, SLOTFRAME_LENGTH)? SLOTFRAME_LENGTH seems new in this document and is listed in the table in Section 14, but the other two are not listed there.
Section 14
Why is MAX_NUMTX not listed in the table?
Can we really give a recommended NUM_CH_OFFSET value, since this is in effect dependent on the number of channels available?
KA_PERIOD is defined but not used elsewhere in the document.
What are the considerations in using a power of 10 vs. a power of 2 as MAX_NUM_CELLS?
Section 16
   MSF defines a series of "rules" for the node to follow.
   It triggers several actions, that are carried out by the protocols defined in the following specifications: the Minimal IPv6 over the TSCH Mode of IEEE 802.15.4e (6TiSCH) Configuration [RFC8180], the 6TiSCH Operation
I'd suggest a brief note that the security considerations of those protocols continue to apply (even though it ought to be obvious); reading them could help a reader understand the behavior of this document as well.
   Sublayer Protocol (6P) [RFC8480], and the Minimal Security Framework for 6TiSCH [I-D.ietf-6tisch-minimal-security]. In particular, MSF
[CoJP again]
   prevent it from receiving the join response. This situation should be detected through the absence of a particular node from the network and handled by the network administrator through out-of-band means, e.g. by moving the node outside the radio range of the attacker.
"the radio range of the attacker" is not exactly a fixed constant ... attackers are not in general bound by legal limits and can increase Tx power subject only to their equipment and budget.
   MSF adapts to traffics containing packets from IP layer. It is possible that the IP packet has a non-zero DSCP (Diffserv Code Point [RFC2597]) value in its IPv6 header. The decision whether to hand
RFC 2597 is talking more about specifically assured forwarding PHB groups than "DSCP codepoint"s per se.
Section 18.1
RFC 6206 seems to only be used as an example (Trickle), and could probably be informative. RFC 8505 might also not need to be normative.
Appendix B
   In MSF, the T is replaced by the length slotframe 1. String s is
nit: "length of"
   2. sum the value of L_shift(h,l_bit), R_shift(h,r_bit) and ci
Is this addition performed in "infinite precision" integer arithmetic or limited to the output width of h, e.g., by modular division? (It's not clear to me whether this is the role T plays or not.)
   8. assign the result of Step 5 to h
The value from step 5 *is* h, so taken literally this says "assign h to h" and is not needed.
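To make the Appendix B precision question concrete, here is a minimal sketch of one possible reading of the quoted steps. It is not the draft's normative algorithm. The function name `sax_hash`, the defaults for `h0`, `l_bit` and `r_bit`, the XOR-then-modulo order and the sample EUI-64 are all assumptions made for illustration; only the sum in step 2 and the substitution of the slotframe length for T come from the quoted text.

```python
def sax_hash(eui64: bytes, T: int, h0: int = 0, l_bit: int = 0, r_bit: int = 1) -> int:
    """Hypothetical reading of the Appendix B steps (illustration only)."""
    if T <= 0:
        raise ValueError("T must be positive")
    h = h0
    for c in eui64:  # c plays the role of ci in the quoted step 2
        # Quoted step 2: sum of L_shift(h, l_bit), R_shift(h, r_bit) and ci.
        total = (h << l_bit) + (h >> r_bit) + c
        # Assumed: XOR the sum with h, then reduce modulo T in every round.
        h = (total ^ h) % T
    return h


eui64 = bytes.fromhex("0011223344556677")  # made-up identifier
slot_offset = sax_hash(eui64, T=101)       # 101 is an arbitrary slotframe length
assert 0 <= slot_offset < 101
```

Under this reading, and as long as h0 is smaller than T, h is below T at the start of every round, so the sum in step 2 stays small and arbitrary-precision versus fixed-width arithmetic gives the same result once the integer type is wide enough. If the reduction modulo T were applied only once at the end, the two would differ, which is why the draft should say which is intended.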
Please use the RFC 8174 boilerplate rather than the RFC 2119 boilerplate.
I agree with Roman's discuss that the relation to SAX-DASFAA should be clarified, and if this is actually needed for interoperability (as stated at some point in the text) it seems this should be part of the body of the document. Or what are the requirements for interoperability? What can be changed in the "example" algorithm and what not?
Two small technical points:
1) Sec 9; mostly double-checking as you probably know better than me:
"6P timeout value is calculated as ((2^MAXBE)-1)*MAXRETRIES*SLOTFRAME_LENGTH"
Often you calculate such a value and then multiply by 2 (or something) to be on the safe side, as there could be e.g. processing delays in the receiving node. I assume the assumption here is that you always need to get the response in the same/after one slot (?). If that is true, I guess the calculation is fine. But wanted to check that there cannot be any additional unknown delays here. Further, these values come a bit out of nothing. Where are MAXBE and MAXRETRIES defined? And if you have an exponential backoff that will stop retrying after MAXRETRIES why do you need also a timeout in addition to that?
2) Sec 16:
"MSF adapts to traffics containing packets from IP layer. It is possible that the IP packet has a non-zero DSCP (Diffserv Code Point [RFC2597]) value in its IPv6 header. The decision whether to hand over that packet to MAC layer to transmit or to drop that packet belongs to the upper layer and is out of scope of MSF. As long as the decision is made to hand over to MAC layer to transmit, MSF will take that packet into account when adapting to traffic."
Why should a packet be dropped based on its DSCP...? Maybe be a bit more neutral here like:
"MSF adapts to traffics containing packets from IP layer. It is possible that the IP packet has a non-zero DSCP (Diffserv Code Point [RFC2597]) value in its IPv6 header. The decision how to handle belongs to the upper layer and is out of scope of MSF.
As long as a decision is made to hand over to MAC layer to transmit, MSF will take that packet into account when adapting to traffic."
Some small editorial nits/comments:
1) Sec 1:
- Maybe expand RPL on first occurrence.
- s/is called as a "MSF session"/is called a "MSF session"/
2) Sec 2
- s/one of more slotframes/one or more slotframes/
3) Sec 4.4
- Please expand JRC on first occurrence. Maybe add a glossary at the beginning?
4) Sec 5.1. "A node implementing MSF MUST implement the behavior described in this section." Not sure if that sentence brings any additional value because that's what specs are for. But I guess it also doesn't hurt. And respectively I find the statement in 5.3 rather confusing: "A node implementing MSF SHOULD implement the behavior described in this section. The "MUST" statements in this section hence only apply if the node implements schedule collision handling." I'm not fully sure what this even means now. Can you explain? Can you maybe rather provide some text to explain when it could/MAY be appropriate to not implement it?
5) Sec 16: "The implementation at IPv6 layer SHOULD ensure that this join traffic is rate-limited before it is passed to 6top sublayer where MSF can observe it." Maybe be less indirect here: "The implementation at IPv6 layer SHOULD rate-limit join traffic before it is passed to 6top sublayer where MSF can observe it." Also this wording is a bit unclear: "How this rate limit is set is out of scope of MSF." Maybe "How this rate limit is implemented is out of scope of MSF."
6) "Appendix A. Contributors" -> Usually Contributors is an own section in the body of the document and not part of the appendix but I'm sure the RFC editor will advise you correctly.
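For the Sec 9 timeout question above, a worked instance of the quoted formula shows the order of magnitude involved. This is only a sketch: the parameter values and the 10 ms timeslot duration are illustrative assumptions, not values taken from the draft, and the function name `sixp_timeout_slots` is made up. It also assumes SLOTFRAME_LENGTH is expressed in timeslots, so the result is a number of timeslots.

```python
def sixp_timeout_slots(maxbe: int, maxretries: int, slotframe_length: int) -> int:
    """Quoted formula: ((2^MAXBE)-1) * MAXRETRIES * SLOTFRAME_LENGTH."""
    return ((2 ** maxbe) - 1) * maxretries * slotframe_length


# Illustrative values only (assumptions, not from the draft).
slots = sixp_timeout_slots(maxbe=5, maxretries=3, slotframe_length=101)
timeout_ms = slots * 10  # assuming a 10 ms timeslot

print(slots, timeout_ms)  # 9393 93930
```

With these numbers the timeout is roughly 94 seconds. Any extra safety margin for processing delay, as raised in point 1), would be a multiplier applied on top of this value.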
I was going to ask that you expand “DODAG” in first use, because it’s not marked as sufficiently common in the RFC Editor’s abbreviation list. But, really, I think the better answer is to ask the responsible AD to ask the RFC Editor to put that asterisk on both DAG and DODAG at this point.
(1) I support Roman's DISCUSS.
(2) The datatracker should point at draft-chang-6tisch-msf being replaced by this document.
(3) §2: "MSF RECOMMENDS the use of 3 slotframes." Why isn't this REQUIRED? How does an implementation signal to a neighboring node that a different number of slotframes are being used (or a different length, which is also RECOMMENDED later)? It seems to me that RECOMMENDING may not be enough for an interoperable implementation...but I may also be missing something in 802.15.4 or rfc8180 (or somewhere else). [BTW, the rfc2119 keyword is RECOMMENDED (not RECOMMENDS).]
I think the use of RECOMMENDED (vs REQUIRED) may be related to this text a couple of paragraphs before:
   A node implementing MSF SHOULD implement the Minimal 6TiSCH Configuration [RFC8180], which defines the "minimal cell", a single shared cell providing minimal connectivity between the nodes in the network. The MSF implementation provided in this specification is based on the implementation of the Minimal 6TiSCH Configuration. However, an implementor MAY implement MSF based on other specifications as long as the specification defines a way to advertise the EB/DIO among the network.
I understand that a configuration other than rfc8180 is possible, but if this document is based on rfc8180, then it would be clearer if the language was stronger (s/SHOULD/MUST) with the understanding that the specification refers to that case.
(4) §4.3:
   While the exact behavior is implementation-specific, it is RECOMMENDED that after having received the first EB, a node keeps listen for at most MAX_EB_DELAY seconds until it has received EBs from NUM_NEIGHBOURS_TO_WAIT distinct neighbors, which is defined in [RFC8180].
rfc8180/§6.2 says that "after having received the first EB, a node MAY listen for at most MAX_EB_DELAY seconds until it has received EBs from NUM_NEIGHBOURS_TO_WAIT distinct neighbors." The use of RECOMMENDED here is not consistent with the optional nature of the MAY.
(5) Nits...
s/represents node's preferred parent/represents the node's preferred parent
s/no restrictions to go multiple MSF sessions/no restrictions to use (?) multiple MSF sessions
s/One of the algorithm met the rule/One of the algorithms that meet the rule
s/Alternative behaviors may involved/Alternative behaviors may be involved
s/when alternative security/when an alternative security
s/node keeps listen/node keeps listening
s/pairs of following counters/pairs of the following counters
Thank you for the work put into this document. Please find below some non-blocking COMMENTs and NITs. An answer will be appreciated.
I hope that this helps to improve the document,
Regards,
-éric
== COMMENTS ==
As in Alissa's comment, please use the RFC 8174 boilerplate.
-- Section 3 --
Suggest to remove "The AutoRxCell MUST always remain scheduled after synchronized. * 6P CLEAR MUST NOT erase any autonomous cells." from the bulleted list and create a new paragraph for those 2 lines.
-- Section 4 --
The whole section seems to assume that the events will work as expected. But, what if this is not the case? E.g., the JP does not send any reply?
-- Section 5.3 --
"we necessarily have NumTxAck <= NumTx" is only true if all nodes behave...
"MUST be divided by 2", the example is about 127 divided by 2 giving the unexpected value (to me at least) of 64... The text should clarify how rounding is handled as it is not a plain right shift by 1.
Step 2, is it also applicable to any value of MAX_NUMTX? Including very small or very large ones?
== NITS ==
-- Section 5.2 --
To be checked by a native speaker but s/can have a node switch parent/can have a node switching parent/ would make the text easier to parse.
-- Section 14 --
Please order the rows of Figure 2.
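On the Section 5.3 rounding point, one reading that reproduces the quoted numbers is that the counters are incremented for the new frame first and halved afterwards, so 127 becomes 128 before the division. The sketch below follows that reading. The function name `update_counters` and the use of floor division are assumptions, not text from the draft.

```python
def update_counters(num_tx: int, num_tx_ack: int, acked: bool, max_numtx: int):
    """Count one transmission, then halve both counters once NumTx reaches MAX_NUMTX."""
    num_tx += 1
    if acked:
        num_tx_ack += 1
    if num_tx >= max_numtx:
        # Assumed rounding: floor division, i.e. a right shift by 1.
        num_tx //= 2
        num_tx_ack //= 2
    return num_tx, num_tx_ack


# The draft's example: NumTx=255, NumTxAck=127, one frame sent and acknowledged.
print(update_counters(255, 127, True, 256))  # (128, 64)

# An odd NumTxAck at the moment of halving: the PDR moves from 127/256 to 63/128.
print(update_counters(255, 126, True, 256))  # (128, 63)
```

In the second call the PDR changes slightly (0.496 before, 0.492 after), so the claim that halving "does not change the value of the PDR" holds only approximately under floor division, whatever value MAX_NUMTX takes. Stating the rounding rule in the draft would remove the ambiguity.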