Micro-loop prevention by introducing a local convergence delay
draft-ietf-rtgwg-uloop-delay-09

Note: This ballot was opened for revision 06 and is now closed.

Alia Atlas Yes

Deborah Brungard No Objection

Ben Campbell No Objection

Comment (2017-10-11 for -07)
(Oops, sorry, I entered the bit about addressing my comments for the wrong draft. The following comments still apply.)

- General: Do I understand correctly that this is a black-box implementation detail? I note that section 4 explicitly says that it is a local-only feature that does not require interoperability. If so, then standards track seems inappropriate. BCP or informational seems to make more sense. Since there are recommendations here, I think BCP is the right choice. (I note Adam made a similar comment.)

-11: Do you expect this section to stay in the RFC? It is likely to become outdated rather quickly.

Editorial Comments:

- General: Please number the tables.

- Sections 2 and 3 and their child sections have quite a few grammar errors. Please proofread them again. I mention a few specifics below, but doubt I caught everything.

- 2, first paragraph: " That means that all non-D neighbors of S on the topology will send to S any traffic destined to D if a neighbor did not, then that neighbor would be loop-free."
I can't parse that sentence. Is it a run-on sentence, or are there missing words?
-- s/"can be work"/"can work"/

-3: " may cause high damages for a network."
I suggest " may cause significant network damage".

-4, last paragraph: "This benefit comes at the expense of eliminating transient forwarding loops involving the local router. "
How is that an "expense"? Isn't it the whole point?

-5.3, first paragraph and paragraph before figure 4:
The MUST is stated twice. Please avoid redundant normative statements. Even if they agree now, they can cause maintenance issues down the road.

Alissa Cooper No Objection

Spencer Dawkins No Objection

Suresh Krishnan No Objection

Warren Kumari No Objection

Comment (2017-10-11 for -07)
Section 1.  Introduction:
"That means that all non-D neighbors of S on the
   topology will send to S any traffic destined to D if a neighbor did
   not, then that neighbor would be loop-free."
 -- I was unable to parse the above. I may just be overtired, but it feels like there are some missing words.


Nits:
" When S-D fails, a transient forwarding loop may appear between S and
   B if S updates its forwarding entry to D before B."
 -- Perhaps "... entry to D before B does." or "... before B updates its forwarding entry"? 

Section 2.1.  Fast reroute inefficiency
"On the  router C, the nexthop to D is the tunnel T thanks to the IGP  shortcut." 
s/the// 

"On C, the tail-end of the TE tunnel (router B) is no more on the shortest-path tree (SPT) to D, ..."
s/is no more on/is no longer on/
(related)
"... so C does not encapsulate anymore the traffic to D..."
s/does not encapsulate anymore/no longer encapsulates/

Section 3.  Overview of the solution
"This ordered convergence, is similar to the ordered FIB ..."
s/,// (superfluous).

Mirja Kühlewind No Objection

Comment (2017-10-09 for -06)
Nit in section 9:
You should probably not talk about 'our' solution or mechanism in an RFC:
s/our/this/ or s/our X/the X described in this document/ 
This appears multiple times in section 9.

Kathleen Moriarty No Objection

Comment (2017-10-10 for -07)
Thanks for addressing the SecDir review comments:
https://mailarchive.ietf.org/arch/msg/secdir/tnRc2LPp6FqfDeyqd2cJExEtdXA

Eric Rescorla No Objection

Comment (2017-10-11 for -07)
Line 115
   Consider the case in Figure 1 where S does not have an LFA to protect
   its traffic to D.  That means that all non-D neighbors of S on the
You need to define LFA.


Line 118
   topology will send to S any traffic destined to D if a neighbor did
   not, then that neighbor would be loop-free.  Regardless of the
   advanced fast-reroute (FRR) technique used, when S converges to the
This is not a grammatical sentence.


Line 132
        S ------ B
             1
        Figure 1
What do the numbers in this box mean? I assume they are route metrics, but you need to say so.


Line 136
   When S-D fails, a transient forwarding loop may appear between S and
   B if S updates its forwarding entry to D before B.
Something seems to have gone badly wrong with this paragraph. Are these lines supposed to be in the previous paragraph?
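
A minimal sketch of the loop the quoted text describes, assuming hypothetical next-hop tables for destination D. Only the S-B link is visible in the figure excerpt above, so the alternate neighbor "C" that B uses after converging is an assumption. While S has already switched to B and B still points at S, traffic bounces between the two until its TTL runs out.

```python
# Hypothetical next-hop tables for destination D; router names follow the quoted text.
def forward(next_hop, src, dst, ttl=8):
    """Follow next hops from src toward dst; return the list of routers visited."""
    path = [src]
    node = src
    while node != dst and ttl > 0:
        node = next_hop[node]
        path.append(node)
        ttl -= 1
    return path

# S has converged after the S-D failure (next hop B); B has not (still points at S).
stale = {"S": "B", "B": "S"}
# Assumption: once B converges, it reaches D through another neighbor, called C here.
converged = {"S": "B", "B": "C", "C": "D"}

looping = forward(stale, "S", "D")        # never reaches D, stops when TTL hits 0
delivered = forward(converged, "S", "D")  # reaches D once B has updated
```

With the stale tables the packet is dropped without reaching D; with B updated it is delivered via the assumed neighbor.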


Line 326
      unstable.  As an example, [I-D.ietf-rtgwg-backoff-algo] defines a
      standard SPF delay algorithm.
You need to define SPF here.


Line 338
   1.  The Up/Down event is notified to the IGP.
Usually, one would say that the IGP is notified of...


Line 552
           S

             Figure 7
Is this the same as the previous figure with T running CEAB?

Alvaro Retana (was Discuss) No Objection

Adam Roach No Objection

Comment (2017-10-10 for -07)
This document doesn't really define any new on-the-wire protocol. Was publication as a BCP rather than a standards track document considered?

The Introduction contains the following text:

   That means that all non-D
   neighbors of S on the topology will send to S any traffic destined to
   D if a neighbor did not, then that neighbor would be loop-free.

I can't parse this sentence. Is there supposed to be a sentence break somewhere in there?

The introduction starts talking about post-failure events (e.g., "when S converges to the new topology") before mentioning a failure of the S-D link. This makes it very hard to follow. I would suggest mentioning the failure being considered before talking about the ensuing events.

Section 4 begins:

   This document defines a two-step convergence initiated by the router
   detecting a failure and advertising the topological changes in the
   IGP.  This introduces a delay between the convergence of the local
   router and the network wide convergence.

This reads backwards to me. With this technique, the network converges first, followed by an introduced delay, followed by router convergence. Right?
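
The ordering in this reading can be sketched as a simple timeline. This is only an illustration under stated assumptions: the function name, the event wording, and the 0.5 s remote convergence time are hypothetical, and the 2 s delay is just one of the example values mentioned for ULOOP_DELAY_DOWN_TIMER, not a recommended default.

```python
def convergence_timeline(remote_convergence_s, uloop_delay_down_s):
    """Return (time, event) pairs, in time order, for a link-down event detected at t=0."""
    events = [
        (0.0, "local router detects the failure and advertises it in the IGP"),
        (remote_convergence_s, "remote routers finish updating their FIBs"),
        (uloop_delay_down_s, "local router updates its own FIB"),
    ]
    return sorted(events)

# Illustrative values only (assumptions, not taken from the draft).
timeline = convergence_timeline(remote_convergence_s=0.5, uloop_delay_down_s=2.0)
order = [event for _, event in timeline]
```

As long as the delay exceeds the time the rest of the network needs, the local FIB update comes last.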

Further on in that section:

   This benefit comes at the
   expense of eliminating transient forwarding loops involving the local
   router.

I can't make sense of this. Eliminating transient forwarding loops is a good thing, right? Not an expense?

I agree with Alvaro that the lack of a recommended default for ULOOP_DELAY_DOWN_TIMER is an issue, especially as the values configured in the examples seem to change arbitrarily from 1 second to 2 seconds.