Neighbor Unreachability Detection Is Too Impatient
RFC 7048

Note: This ballot was opened for revision 06 and is now closed.

(Jari Arkko) Yes

(Brian Haberman) Yes

(Joel Jaeggli) Yes

(Ted Lemon) Yes

Comment (2013-06-12 for -06)
The document looks good. Is there any hope of a 4861bis?

(Richard Barnes) No Objection

(Stewart Bryant) No Objection

Comment (2013-06-10 for -06)
I found the chatty style at odds with the demands of a precise definition of the operation of a state machine.

Please can you look at the following:

"Giving up after three packets spaced one second apart..."

You need to say what you are "giving up", but in any case I am not sure "giving up" is the right term.

+++

"from implementations that try for a long time"
Try what?

+++

"link-layer address of the destination has changed"

Isn't that the link-layer address of the neighbor?

+++

"we will instead transition"

I imagine the node will transition, rather than the authors.

+++

"then it makes sense to stop.."

Don't you mean: "it is RECOMMENDED"?

+++

"but it MUST switch to multicast Neighbor Solicitations sooner or later."

Do you really mean "when it can be bothered" which is what the statement says?


====

A nit that might usefully be addressed if you are re-spinning the draft:

"If implementations": I think that should be "If an implementation".

(Gonzalo Camarillo) No Objection

(Benoît Claise) No Objection

(Spencer Dawkins) No Objection

Comment (2013-06-03 for -06)
I like this document. It's short and clear. I do have one comment I'm hoping you'll consider.

In this text:

4.  Example Algorithm

   This section is NOT normative, but specifies a simple implementation
   which conforms with this document.

I'm seeing the "NOT normative" statement, which is fine, but I'm also seeing several occurrences of "recommended" in the following paragraphs. I'm not confused by the use of "recommended" in lower case. My issue is that I'm not sure where the recommendation is coming from - whether it's in the context of this example algorithm, or from somewhere else.

If I'm guessing right, and the context is this example algorithm, would it be easier to understand if you replace "recommended behavior" with something like "behavior used by this implementation"?

   The recommended behavior is to have 5 attempts, with timing spacing
   of 0 (initial request), 1 second later, 3 seconds after the first
   retransmission, then 9, then 27, and switch to UNREACHABLE after the
   first three transmissions.  Thus relative to the time of the first
   transmissions the retransmissions would occur at 1 second, 4 seconds,
   13 seconds, and finally 40 seconds.  At 4 seconds from the first
   transmission the NCE would be marked UNREACHABLE.  That recommended
   behavior corresponds to:

      MAX_UNICAST_SOLICIT=5

      RETRANS_TIMER=1 (default)

      MAX_RETRANS_TIMER=60

      BACKOFF_MULTIPLE=3

      MARK_UNREACHABLE=3

   After 3 retransmissions the implementation would mark the NCE
   UNREACHABLE.  That results in trying an alternative neighbor, such as
   another default router or ignoring a redirect as specified in
   [RFC4861].  With the above recommended values that would occur after
   4 seconds after the first transmission compared to the 2 seconds
   using the fixed scheme in [RFC4861].  That additional delay is small
   compared to the default 30,000 milliseconds ReachableTime.

   After 5 transmissions, i.e., 40 seconds after the initial
   transmission, the recommended behavior is to switch to multicast NUD
   probes.  In the language of the state machine in [RFC4861] that
   corresponds to the action "Discard entry".  Thus any attempts to send
   future packets would result in sending multicast NS packets.  An
   implementation MAY retain the backoff value as it switches to
   multicast NUD probes.  The potential downside of deferring switching
   to multicast is that it would take longer for NUD to handle a change
   in a link-layer address i.e., the case when a host or a router
   changes their link-layer address while keeping the same IPv6 address.
   However, [RFC4861] says that a node MAY send unsolicited NS to handle
   that case, which is rather infrequent in operational networks.
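
The arithmetic in the quoted example can be checked with a short sketch. The Python below is illustrative only and is not taken from the draft: the function name probe_schedule and its return shape are hypothetical, the parameter names simply mirror the quoted constants, and any randomization of the timer is left out. It assumes the NCE is marked UNREACHABLE at the moment the MARK_UNREACHABLE-th transmission is sent, which is the reading that matches the quoted 4-second figure.

```python
def probe_schedule(max_unicast_solicit=5, retrans_timer=1.0,
                   max_retrans_timer=60.0, backoff_multiple=3,
                   mark_unreachable=3):
    """Return (send_times, unreachable_at), in seconds from the first probe.

    Hypothetical helper: the defaults copy the values quoted above.
    """
    send_times = [0.0]          # the initial request goes out at t = 0
    interval = retrans_timer    # wait before the first retransmission
    for _ in range(max_unicast_solicit - 1):
        send_times.append(send_times[-1] + interval)
        # Exponential backoff, capped at the maximum retransmit timer.
        interval = min(interval * backoff_multiple, max_retrans_timer)
    # Assumption: the entry is marked when the Nth transmission is sent.
    unreachable_at = send_times[mark_unreachable - 1]
    return send_times, unreachable_at


times, unreachable_at = probe_schedule()
print(times, unreachable_at)
```

With the quoted values this gives transmissions at 0, 1, 4, 13, and 40 seconds, with the UNREACHABLE mark at 4 seconds, which agrees with the figures in the example text.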

(Adrian Farrel) No Objection

Comment (2013-06-12 for -06)
I support the comments from other ADs about precision in language. I think you would be well advised to tighten the specification.

---

The proposed new state machine entry is:

   PROBE           Retransmit timeout,     Increase timeout  UNREACHABLE
                   N or more               Send multicast NS
                   retransmissions.

How is it possible for the event of "more than N retransmissions" to happen in PROBE state? You probably need:

   PROBE           Retransmit timeout,     Increase timeout  UNREACHABLE
                   Nth retransmission      Send multicast NS
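
The point about "N or more" can be illustrated with a small sketch. The Python below is not from the draft: the function names and action strings are hypothetical, and the branch for fewer than N retransmissions (retransmit and stay in PROBE) is an assumption patterned on the existing RFC 4861 state machine. It shows that, once the entry leaves PROBE on the Nth retransmission, a count above N is never observed in PROBE.

```python
def on_retransmit_timeout(state, retransmissions, n):
    """Handle a retransmit timeout; return (next_state, actions).

    `retransmissions` is the number of retransmissions already sent.
    """
    if state != "PROBE":
        raise ValueError("this sketch only models the PROBE state")
    if retransmissions < n:
        # Assumed behavior: keep probing with unicast NS.
        return "PROBE", ["retransmit unicast NS"]
    # Nth retransmission reached: take the proposed new transition.
    return "UNREACHABLE", ["increase timeout", "send multicast NS"]


def run_probe(n):
    """Drive timeouts until the entry leaves PROBE; record counts seen."""
    state, sent, seen = "PROBE", 0, []
    while state == "PROBE":
        seen.append(sent)
        state, _actions = on_retransmit_timeout(state, sent, n)
        if state == "PROBE":
            sent += 1   # one more retransmission went out
    return state, seen


print(run_probe(3))
```

For N = 3 the counts seen while in PROBE are 0, 1, 2, and 3, so the "or more" case has no way to fire in this model.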

---

Although you have retained the garbage collection from 4861, you say:

   A node MAY garbage collect a Neighbor Cache Entry at any time as
   specified in RFC 4861.  This does not change with the introduction of
   the UNREACHABLE state in the conceptual model.

I would have thought that the garbage collection LRU scheme should consider a trade-off between LRU and UNREACHABLE state. Which would you discard: an entry not used for 29 seconds, or one that is unreachable and not used for 28 seconds? Or is this too much fine-tuning?
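
One way to picture such a trade is sketched below. This is purely hypothetical and is specified by neither RFC 4861 nor the draft: the Entry type, the eviction_victim function, and the unreachable_bonus weighting are all invented for illustration. The idea is to treat an unreachable entry as if it had been idle a little longer than it really has.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    idle_seconds: float
    unreachable: bool


def eviction_victim(entries, unreachable_bonus=5.0):
    """Pick the entry with the largest effective idle time.

    Hypothetical policy: unreachable entries get `unreachable_bonus`
    seconds added to their idle time; a bonus of 0 is plain LRU.
    """
    def effective_idle(entry):
        bonus = unreachable_bonus if entry.unreachable else 0.0
        return entry.idle_seconds + bonus
    return max(entries, key=effective_idle)


cache = [Entry("a", 29.0, False), Entry("b", 28.0, True)]
print(eviction_victim(cache).name)
print(eviction_victim(cache, unreachable_bonus=0.0).name)
```

With a 5-second bonus the unreachable entry idle for 28 seconds is discarded first; with no bonus the policy reduces to plain LRU and the 29-second entry goes instead.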

(Stephen Farrell) No Objection

(Barry Leiba) No Objection

Comment (2013-06-08 for -06)
I'm just short of a DISCUSS on this because there was no response to the appsdir review, which did raise a couple of minor things apart from the many editorial comments in it.  I know that Erik isn't happy with the level of editorial commenting in the directorate reviews, but we have to accept that thorough reviewers feel the need to do that.

Please do address Murray's comments (and my reply to them) that go beyond mere editorial stuff.

-- Section 3 --
   A node MAY unicast the first few Neighbor Solicitation messages even
   while in UNREACHABLE state, but it MUST switch to multicast Neighbor
   Solicitations sooner or later.

This was brought up in the appsdir review: "first few" and "sooner or later" are sufficiently unspecific that they call the need for a 2119 MUST into question.  How is the operation of the protocol affected if the node interprets "first few" to be 3?  How about 12?  What about 3,141,592?  Is it OK if I decide that "sooner or later" amounts to "just before the Rapture"?

It would probably be better here to say what problem the switch to multicast will address, and it's most likely better to skip the 2119 words on that.  Something on the order of this:

   A node can unicast the first few Neighbor Solicitation messages even
   while in UNREACHABLE state, but [this bad stuff will happen] until it
   switches to multicast Neighbor Solicitations.


Other comments; no need to respond to these. Take them or modify them as you please:

-- Section 1 --

nit: "These short can be" -> "These short timeouts can be"

The last sentence of the first paragraph is punctuated oddly, and I'm not quite sure what it really means:

   In these cases, when NUD fails,
   the host will try the alternative neighbor; the next router in the
   Default Router List, or discard the NCE which will also send using a
   different router.

Is the semicolon just wrong, and there are meant to be three items in the list of what can happen when NUD fails?  If so:

NEW
   In these cases, when NUD fails
   the host will try the alternative neighbor, try the next router in
   the Default Router List, or discard the NCE which will also send
   using a different router.

(You need to repeat the verb "try" because the third item in the list has a different verb.)

Still on that sentence, what does the last item mean?

   discard the NCE which will also send using a different router.

It looks like this is saying that the third option is to discard the NCE, and a result of that will be a switch to a different router.  Is that right?  If so, re-wording a little will help; "will send using a different router" makes me think that there's some noun missing after "send".  Maybe, "discard the NCE (which also results in the use of a different router)."

-- Section 3 --

   Packets should be sent
   following the next-hop selection algorithm in section 5.2 in
   [RFC4861] which disregards NCEs that are not reachable.

There's nothing at all wrong with this, but I want to note that if you write it this way:

...algorithm in [RFC4861], Section 5.2, which...

then the tool that converts the RFCs to HTML will generate a link to the correct section directly.  Otherwise, the link will just go to the top of 4861.

(Pete Resnick) No Objection

(Martin Stiemerling) No Objection

(Sean Turner) No Objection