Last Call Review of draft-ietf-mpls-mldp-node-protection-05
review-ietf-mpls-mldp-node-protection-05-genart-lc-davies-2015-08-28-00

Request Review of draft-ietf-mpls-mldp-node-protection
Requested rev. no specific revision (document currently at 08)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2015-09-08
Requested 2015-08-27
Draft last updated 2015-08-28
Completed reviews Genart Last Call review of -05 by Elwyn Davies (diff)
Genart Telechat review of -06 by Elwyn Davies (diff)
Secdir Last Call review of -05 by Shaun Cooley (diff)
Assignment Reviewer Elwyn Davies
State Completed
Review review-ietf-mpls-mldp-node-protection-05-genart-lc-davies-2015-08-28
Reviewed rev. 05 (document currently at 08)
Review result Almost Ready
Review completed: 2015-08-28

Review
review-ietf-mpls-mldp-node-protection-05-genart-lc-davies-2015-08-28




    I am the assigned Gen-ART reviewer for this draft. The General Area
    


    Review Team (Gen-ART) reviews all IETF documents being processed
    


    by the IESG for the IETF Chair.  Please treat these comments just
    


    like any other last call comments.
    





    For more information, please see the FAQ at
    







<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>

.
    





    Document:
    draft-ietf-mpls-mldp-node-protection-05


    Reviewer:
    Elwyn Davies


    Review Date:
    2015/08/28


    IETF LC End Date:
    2015/09/08


    IESG Telechat date: (if known)
    -





    Summary:
    


    Almost ready.  The distinction that (I believe) is the case, between
    the need for unidirectional bypass LDPs in the non-root case and
    bidirectional (or two unidirectional) ones in the root case appears
    to merit some additional explanation.  Otherwise, there are a couple
    of minor issues  that are only just above the level of nits.  In
    particular, there doesn't seem to be any explanation of why the PLRs
    and MPTs have to be directly connected to the protected node (RFC
    6388 is happy to find non-directly connected nodes) and I cannot see
    why direct connection is vital to this specification.  This may of
    course because I am not an MPLS afficionado!  There are, however, a
    considerable number of language nits which I have documented -
    another editorial pass by a reviewer with English mother tongue
    after fixing the nits I found would probably help.





    Major issues:


    None. 





    Minor issues:


    s1:  This specification uses LDP capability negotiation (RFC 5561). 
    The introduction should be clear that nodes involved in the node
    protection scheme MUST support RFC 5561 negotiations.





    s1, para 2:  There is a very dangerous slang phrase here: "most ...
    should just work"!   Does this have the literal meaning that there
    are some circumstances in which they might NOT work and/or there are
    other capabilities that aren't enabled - if so, what are they? Or, I
    guess more likely, the slang meaning that using this method will
    allow these capabilities to work without additional effort.  Please
    use a non-slang phrase and be more precise as to what will/won't
    work.





    s2.1:







   The upstream LSR in figure 1 is LSR1
   because it is the ***first hop*** along the shortest path to reach the root
   address.





    Does it have to be the 'first hop' (I read this as the first LSR on
    the path from N towards the root)?


    RFC 6388 s2.4.1.1 allows the upstream to be any LSR along the path
    according to some local selection policy.  Can't the PLR be any such
    upstream LSR?





    s2.2:  Following on from the previous comment, do LSR1..3 *have* to
    be 'directly connected'?  Can they not be just one from the set of
    LSRs along each separate LSR emanating from the root?





    s2.2:  If I understand correctly, the bypass LSPs have to be
    bidirectional (or they could be two unidirectional ones) unlike
    those in s2.1 which will be unidirectional.  I think this ought to
    be mentioned, assuming I am right - and presumably one could do a
    bit of optimisation in setup.  This has some knock-on effects as
    regards what happens when the node fails.  I wonder if there should
    be some explanation of what happens in an extra sub-section in s4 -
    just that the various LSRs need to think about what role they are
    playing depending on where the incoming packets are coming from, I
    guess.





    s2.3, next to last para:







To remove all PLR addresses belonging to
   the encoded Address Family, an LSR N MUST encode PLR Status Value
   Element with no PLR entry and "Num PLR entry" field MUST be set to
   zero.





     This is inconsistent with the statement in the "PLR Status Value
    Element" figure:


                PLR entry (1 or more)


    I guess this last should be 'zero or more'.





    Nits/editorial comments:


    ====================== 


    Abstract: Need to expand LSR on first use.





    Abstract and  s1: s/that has been built by/that have been built by
    the/ (2 places)





    s1, para 1: Need to expand RSVP-TE and LFA on first use, and give
    references for them - probably RFC 5420 for RSVP-TE and RFC 5286 for
    LDP LFA.





    s1, para 2: s/Targetted/Targeted/





    s1.2: mLDP probably needs a reference - RFC 6388, I guess





    s1, para 2: the concept 'Target(t)ed LDP session' comes from RFC
    7060 - please add a reference.





    s2: It might be helpful to mention that the tLDP sessions are
    established over the bypass LSPs from PLR to MPT and reiterate the
    comments in s1 about how these LSPs might be established to avoid
    the protected node. 





    s2: The term 'root node' (presumably from RFC 6388) should be in the
    terminology section.





    s2: s/e.g./e.g.,/





    s2.1, s2.2, s5: The use of phrases such as 'we are ...ing' is
    deprecated for RFCs - please rephrase using, for example, 'this
    document <does something>'.





    s2.1, Fig 1 caption: s/to the LSR2/to LSR2/





    s2.1: 'which is feature enabled':  Although it is probably obvious
    it would be clearer to say 'which has the node protection feature
    enabled' 





    s2.1: s/i.e./i.e.,/





    s2.1: s/for Capability negotiation/during Capability negotiation/





    s2.2, last para: s/don't/doesn't/, s/will help/will help in/





    s2.3, para 1: s/with MP status TLV/with an MP status TLV (see
    Section 5 of [RFC6388])/





    s2.3: 'Length' and 'Num PLR entry' fields need a type (presumably
    'unsigned integer')





    s2.3, "Address Family": s/Address Family/Addr Family/ (for
    consistency with Figure); also add a reference for the correct IANA
    registry
(

http://www.iana.org/assignments/address-family-numbers/address-family-numbers.xhtml

)





    s2.3, "Num PLR entry": 


    OLD:


          Element, followed by "Num PLR entry" field (please see format
    of a


          PLR entry below).


    NEW:


          Element as an unsigned, non-zero integer followed by that
    number of  "PLR entry" fields in the format specified below.


    END


    (but see issue above, regarding non-zero)





    s2.3, "PLR Entry", s/must be zero/MUST be zero/ (I assume). (also
    applies to Reserved fields in s5.4)





    s2.3, 3rd last para: s/is depending on/is dependent on/





    s2.3, next to last para:


    OLD:


    by setting "A bit"


       to 1 or 0 respectively in corresponding PLR entry.


    NEW:


    by setting the "A bit"


       to 1 or 0 respectively in the corresponding PLR entry.


    END





    s2.3, last para:  The term "MP FEC TLV" is not defined explicitly
    anywhere.  I take it from RFC 6388 that it is (probably) a FEC TLV
    [RFC5036] with a P2MP FEC Element in it.  Please give references and
    make it clear what is implied.





    s3, para 1: s/PLR MP Status/PLR Status/ (there is no such thing as a
    PLR MP Status.)





    s3, "Length": Needs the type defined (unsigned integer presumably);
    also s/(for Address/for Address/





    s3, "Address Family": s/Address Family/Addr Family/ (for consistency
    with Figure); also add a reference for the correct IANA registry
(

http://www.iana.org/assignments/address-family-numbers/address-family-numbers.xhtml

)





    s3, last para: s/Typical/Typically/





    s4, para after Fig 3:


    The English in this para needs considerable attention. Suggestion:


    OLD:




   Assume that LSR1 is the PLR for protected node N, LSR2 and LSR3 are
   MPTs for node N. When LSR1 discovered that node N is unreachable, it
   can't determine whether it is the 'LSR1 - N' link or node N that
   failed.  In Figure 3, the link between LSR1 and N is also protected
   using Fast ReRoute (FRR) [

RFC4090

] link protection via node M. LSR1
   MAY potentially invoke 2 protection mechanisms at the same time,
   redirection the traffic due to link protection via node M to N, and
   for node protection directly to LSR1 and LSR2.  If only the link
   failed, LSR2 and LSR3 will receive the packets twice due to the two
   protection mechanisms.  To prevent duplicate packets to be forwarded
   to the receivers on the tree, LSR2 and LSR3 need to determin which
   upstream node to accept the packets from.  So, either from the
   primary upstream LSR N or from the secondary upstream LSR1, but never
   both at the same time.  The selection between the primary upstream
   LSR or (one or more) secondary upstream LSRs (on LSR2 and LSR3) is
   based on the reachability of the protected node N. As long as N is
   reachable, N is the primary upstream LSR who is accepting the MPLS
   packets and forwarding them.  Once N becomes unreachable, the
   secondary upstream LSRs (LSR1 in our example) are activated.  Note
   that detecting if N is unreachable is a local decision and not
   spelled out in this draft.  Typical link failure or Bidirectional
   Forwarding Detection (BFD) can be used to determine and detect node
   unreachability.
NEW:
   Assume that LSR1 is the PLR for protected node N, LSR2 and LSR3 are
   MPTs for node N. When LSR1 discovers that node N is unreachable, it
   cannot immediately determine whether it is the link from LSR1 to N or the actual node N that
   has failed.  In Figure 3, the link between LSR1 and N is also protected
   using Fast ReRoute (FRR) [

RFC4090

] link protection via node M. LSR1
   MAY potentially invoke both protection mechanisms at the same time, that is
   redirection of the traffic using link protection via node M to N, and
   for node protection directly to LSR1 and LSR2.  If only the link
   failed, LSR2 and LSR3 will receive the packets twice due to the two
   protection mechanisms.  To prevent duplicate packets being forwarded
   to the receivers on the tree, LSR2 and LSR3 need to determine from which
   upstream node they should accept the packets.  This can be either from the
   primary upstream LSR N or from the secondary upstream LSR1, but never
   both at the same time.  The selection between the primary upstream
   LSR or (one or more) secondary upstream LSRs (on LSR2 and LSR3) is
   based on the reachability of the protected node N. As long as N is
   reachable from an MPT, the MPT should accept and forward the MPLS
   packets from N.  Once N becomes unreachable, the LSPs frpm 
   secondary upstream PLR LSRs (LSR1 in our example) are activated.  Note
   that detecting if N is unreachable is a local decision and not
   spelled out in this draft.  Typically link failure or Bidirectional
   Forwarding Detection (BFD) can be used to determine and detect node
   unreachability.
END

s4.1, caption of Fig 4: s/after N failure/after failure of node N/

s4.1, last para:
OLD:
   Assume that LSR1 has detected that Node N is unreachable and invoked
   both the Link Protection and Node Protection procedures as described
   in this draft.  LSR1 is acting as PLR and sending traffic over both
   the backup P2P LSP to node N (via M) and the P2P LSPs directly to
   LSR2 and LSR3, acting as MPT LSRs.  The sequence of events are
   depending on whether the link 'LSR1 - N' has failed or node N itself.
   The node's downsteam from the protected node (and participating in
   node protection) MUST have the capability to determin that the
   protected node became unreachable.  Otherwise the procedures below
   can not be applied.
NEW:
   Assume that LSR1 has detected that Node N is unreachable and invoked
   both the Link Protection and Node Protection procedures as described
   in this example.  LSR1 is acting as PLR and sending traffic over both
   the backup P2P LSP to node N (via M) and the P2P LSPs directly to
   LSR2 and LSR3, acting as MPT LSRs.  The sequence of events is
   dependent on whether the link from LSR1 to N has failed or node N itself.
   The nodes downstream from the protected node (and participating in
   node protection) MUST have the capability to determine that the
   protected node has become unreachable.  Otherwise the procedures below
   can not be applied.
END

s4.1.1: s/over the/over to the/; s/that where/that were/

s4.1.2: s/MUST sent a Label/MUST sen a Label/

s5.2, last para: s/the MPT capable/MPT capable/

s5.4, "Length": - unsigned integer again.

s5.4, P and M bits: Add 'Set to 1 to indicate...'