Skip to main content

Early Review of draft-ietf-bess-evpn-fast-df-recovery-07
review-ietf-bess-evpn-fast-df-recovery-07-rtgdir-early-farrel-2023-05-27-00

Request Review of draft-ietf-bess-evpn-fast-df-recovery
Requested revision No specific revision (document currently at 08)
Type Early Review
Team Routing Area Directorate (rtgdir)
Deadline 2023-06-30
Requested 2023-05-25
Requested by Matthew Bocci
Authors Patrice Brissette , Ali Sajassi , Luc André Burdet , John Drake , Jorge Rabadan
I-D last updated 2023-05-27
Completed reviews Rtgdir Early review of -07 by Adrian Farrel (diff)
Comments
The draft has been through WG LC in BESS. Please can the RTG Dir review it before progressing the document.
Assignment Reviewer Adrian Farrel
State Completed
Request Early review on draft-ietf-bess-evpn-fast-df-recovery by Routing Area Directorate Assigned
Posted at https://mailarchive.ietf.org/arch/msg/rtg-dir/h3_4HLn6-dTEAm7qumBqWxDRjdE
Reviewed revision 07 (document currently at 08)
Result Has issues
Completed 2023-05-27
review-ietf-bess-evpn-fast-df-recovery-07-rtgdir-early-farrel-2023-05-27-00
Hello

I have been selected to do a Routing Directorate early review of this draft..
https://datatracker.ietf.org/doc/draft-ietf-bess-evpn-fast-df-recovery. I
reviewed revision -07.

The Routing Directorate will, on request from the working group chair, perform
an early review of a draft before it is submitted for publication to the IESG.
The early review can be performed at any time during the draft's lifetime as a
working group document. The purpose of the early review depends on the stage
that the document has reached.

I reviewed revision -07 of this draft after the completion of WG last call and
shepherd review, but before the WG makes a publication request. The purpose of
a review at this stage is to improve the quality before AD review and so reduce
the AD workload as wellas improve the output of the working group and the final
quality of any published RFCs.

For more information about the Routing Directorate, please see
https://wiki.ietf.org/en/group/rtg/RtgDir

Document: draft-ietf-bess-evpn-fast-df-recovery-07.txt
Reviewer: Adrian Farrel
Review Date: 2023-05-27
Intended Status: Standards Track

Summary:

I have some minor concerns about this document that I think should be resolved
before it is submitted to the AD.

This document is short and readable. While most of what I found is nits, they
somewhat detract from the clarity of the document. The minor points could do
with small additions to the text to help the reader and smooth the document's
passage through the IESG.

===

Abstract should note that this document updates RFC 8584 and say
(briefly) how. (Note that idnits warned you about this.)

I would put similar text in the Introduction, but that text can have a
pointer to Section 2.2).

I note that the topic of "updates" was raised by matthew in his review,
and that 2.2 is clear about the changes. I note, however, that the
decision to make this an Update happened after WG last call, and I
wonder whether the owrking group is fully on board with the implication
that any future implementation claiming conformance to 8584 will be
required to implement this specification. There is a difference between
an Update and an additional procedure.

---

Abstract

Move "(DF)" to the first use of "Designated Forwarder"

Please expand "EVI" and "PE"

s/via a simple signaling/via simple signaling/

---

Move the requirements language into a new Section 1.1

Please use the correct version of the boilerplate text...

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

---

1.

Please expand "DF", "EVI", and "PE" on first use.

s/applying Highest/applying the Highest/
s/independent of number/independent of the number/
s/via a simple signaling/via simple signaling/
s/on simple one-way signaling/on a simple one-way signaling/

---

1.2

s/in redundancy group/in a redundancy group/

---

1.2

The RFC Editor will ask you to consider another term in place of
"blackholing". You might choose to retain the term, or you might
think that it is OK to say "packets being dropped if the timer is
too long"

There are quite a few similar uses throughout the document.

---

1.2

   upon re-entry of the packet

I think you mean, "upon the packet re-entering the Ethernet Segment"

---

1.3.

This section is a bit messy because it talks about the "proposed
solution" in text that will eventually become an RFC, and because it
makes cryptic references to mechanisms that have not yet been described.

I am not convinced that you actually need this text (why are you trying
to sell the benefits having already told us there is a problem to be
solved?) but if you want to keep the text, I would suggest that you
position it as "design principles" and scale it right back. Something
like...

1.3.  Design Priniciples for a Solution

   The solution presented in this document follows several design
   pronciples as follows.

   *  Complicated handshake signamling mechanisms and state machines are
      avoided in favor of a simple uni-directional signaling approach.

   *  The solution is backwards-compatible (see Section 4).

   *  Existing DF Election algorithms are supported.

   *  The solution is independent of any BGP delays in propagation of
      Ethernet Segment routes (Route Type 4).

   *  The solution is agnostic of the actual time synchronization
      mechanism used.

---

2.

s/participating to a common/participating in a common/
s/at a same pre-announced time/at the same pre-announced time/
s/e.g.  NTP/e.g., NTP/
s/along with Ethernet/along with the Ethernet/
s/from local PE/from the local PE/
s/is called "Service/is called the "Service/
s/by newly added/by the newly added/

--

2.

What is "carving state"? I see that "service carving" is a term of art
in RFCs 7432 and 8584, but there is no mention there of "carving state".

---

2.

OLD
   to communicate to
   other partners the Service Carving Time
NEW
   to communicate the
   Service Carving Time to other partners
END

---

2.

   When a
   new PE is inserted or an existing PE device

Unqualified verb: a new PE is inserted where? Probably "inserted in an
Ethernet segment".

Incomplete clause: an existing PE device is what? Possibly "booted up"
in line with the text after Figure 1.

---

2.

   Upon reception of that new BGP Extended Community, partner PEs can
   determine exactly the anticipated carving time.

s/reception/receipt/

Why "exactly"? Is there some doubt that other mechanisms might not be
exact?
How "anticipated"? Is this still guesswork?
Maybe just...

   ...can determine the carving time.

---

2.1

s/needs to be defined/is defined/
s/along with Ethernet/along with the Ethernet/
s/as a 8-octet/as an 8-octet/
s/the NTP protocol/NTP/

---

2.1

   *  Timestamp Fractional Seconds: 16 bits of the NTP fractional
      seconds are encoded in this field.  The use of a 16-bit fractional
      seconds yields adequate precision of 15 microseconds (2^-16 s).

Which 16 bits? I think "the high order 16 bits"

---

2.1

Do you intend this solution to have a "bad time" when the NTP era
transitions in 2036? Or do you expect the implementation to handle NTP
wrapping? If so, you should probably point this out (because otherwise I
can see some fun in the field).

---

2.1

                        1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = 0x06   | Sub-Type(0x06)| RSV |  DF Alg | |A| |T|       ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~     Bitmap    |            Reserved = 0                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   *  Bit 3: Time Synchronization (corresponds to Bit 27 of the DF
      Election Extended Community).  When set to 1, it indicates the
      desire to use Time Synchronization capability with the rest of the
      PEs in the Ethernet Segment.

Why bit 3? I know the answer is "because that's the bit our
implementation uses" and I'm fine with that and I'm not asking for any
change, but it makes me suspicious about what happened to bits 0 and 2.
Why aren't they used? Is someone squatting on them?

---

2.2

       9.1  Where SCT timestamp is present on the RECV_ES event of
            Action 11, wait the remaining time before continuing to 9.2.

You could usefully clarify "the remaining time".  I think you are saying
that the implementation should "wait until the time indicated by the
timestamp in the  (with the possible reduction by the skew value) is
in the past". This would also cover you against the transmission of
timestamps that are already in the past or which go into the past when
the skew is deducted.

---

2.2

Why do you show:

   10. DF_DONE on exiting the state: If a new DF election is triggered
       and the current DF is lost, then assume an NDF for the local PE
       for the VLAN or VLAN Bundle.

   11. DF_DONE on VLAN_CHANGE, RCVD_ES, or LOST_ES: Transition to
       DF_CALC.

As far as I can see, these are unchanged from 8584/2.1 and showing them
in this section is confusing.

---

What's a "peering timer"? I think you have assigned a name to the timer
described in step 2 of section 8.5 of RFC 7432. No reason not to give it
a name, but perhaps you should make it nice and clear what you are
talking about?

So...

In the examples in section 3 (thanks for these), it looks like the value
of the SCT and the value of the peering timer are coincidentally "now+3"
and 3 respectively. In fact they are both based on the value of the
timer configured for all PEs on the Ethernet Segment. This could be
clearer.

---

4.

   PEs
   running a baseline DF election mechanism will simply discard the new
   SCT BGP extended community as unrecognized.

Would be useful to give a reference for that behavior.

---

4.

   A PE can indicate its willingness to support clock-synched carving by
   signaling the new 'T' DF Election Capability as well as including the
   new Service Carving Time BGP extended community along with the
   Ethernet Segment Route (Type-4).  In the case where one or more PEs
   attached to the Ethernet Segment do not signal T=1, all PEs in the
   Ethernet Segment SHALL revert back to the [RFC7432] timer approach.
   This is especially important in the context of the VLAN shuffling
   with more than 2 PEs.

I see some challenges here.

The first is when a PE joins the segment while DF election is ongoing
among the other PEs.

The second is what happens if the SCT BGP extended community is present
when the T-bit is not set.

The third is what happens if the T-bit is set but no SCT BGP extended
community is provided.

All of these are easy to describe.

---

5.

Is there a "downgrade attck" achieved by causing a T-bit to be clear?

---

5.

   This document uses MPLS and IP-based tunnel
   technologies to support data plane transport.  Security
   considerations described in [RFC7432] and in [RFC8365] are equally
   applicable.

I don't think this document is concerned with the data plane transport.

On the other hand, you should describe the impact of any attacks on the
new protocol elements (for example, what happens if the SCT is set to a
value that is many hours ahead?). Of course, your escape clause remains
that the message must be protected in the same way that other messages
(in 7432 and 8584) are protected, but you should still describe the
risks.

---

6.

Looks like the SCT extended community has already been assigned. So, you
need...

OLD
   This document solicits the allocation of the following sub-type in
   the "EVPN Extended Community Sub-Types" registry setup by [RFC7153]:

         0x0F     Service Carving Timestamp    This document
NEW
   IANA maintains the "EVPN Extended Community Sub-Types" registry set
   up by [RFC7153].  IANA is requested to confirm the First Come First
   Served assignment as follows:

   Sub-Type Value | Name                      | Reference     | DATE
   ---------------+---------------------------+---------------+------
         0x0F     | Service Carving Timestamp | This document | TBD

   IANA should replace the field TBD with the date of publicaton of this
   document as an RFC.

END

---

6.

Just a little political correctness...

OLD
   This document solicits the allocation of the following values in the
   "DF Election Capabilities" registry setup by [RFC8584]:

         Bit         Name                             Reference
         ----        ----------------                 -------------
         3           Time Synchronization             This document
NEW
   IANA maintains the "DF Election Capabilities" registry set up by
   [RFC8584].  IANA is requested to make the following assignment from
   this registry:

         Bit     Name                         Reference         Date
         ----    ----------------             -------------     ----
         3       Time Synchronization         This document     TBD

   IANA should replace the field TBD with the date of publicaton of this
   document as an RFC.
END