Telechat Review of draft-ietf-bess-evpn-fast-df-recovery-09
review-ietf-bess-evpn-fast-df-recovery-09-rtgdir-telechat-robles-2024-08-19-00

Request Review of draft-ietf-bess-evpn-fast-df-recovery
Requested revision No specific revision (document currently at 12)
Type Telechat Review
Team Routing Area Directorate (rtgdir)
Deadline 2024-08-19
Requested 2024-08-02
Requested by Gunter Van de Velde
Authors Patrice Brissette, Ali Sajassi, Luc André Burdet, John Drake, Jorge Rabadan
I-D last updated 2025-05-31 (Latest revision 2024-11-20)
Completed reviews Rtgdir Early review of -07 by Adrian Farrel (diff)
Genart IETF Last Call review of -09 by Elwyn B. Davies (diff)
Opsdir IETF Last Call review of -10 by Tim Chown (diff)
Rtgdir Telechat review of -09 by Ines Robles (diff)
Iotdir Telechat review of -09 by Toerless Eckert (diff)
Intdir Telechat review of -09 by Dave Thaler (diff)
Comments
Document is to be sent for IESG review
Assignment Reviewer Ines Robles
State Completed
Request Telechat review on draft-ietf-bess-evpn-fast-df-recovery by Routing Area Directorate Assigned
Posted at https://mailarchive.ietf.org/arch/msg/rtg-dir/GMfpi8V-G2FJ5A7PstylkLPRCRY
Reviewed revision 09 (document currently at 12)
Result Not ready
Completed 2024-08-19
Routing Directorate review of draft-ietf-bess-evpn-fast-df-recovery-09

Summary:

The draft proposes enhancements to the DF (Designated Forwarder) election
process in EVPN, particularly to improve recovery times after failures of
Provider Edge (PE) devices. It introduces a mechanism for fast DF recovery
using clock synchronization between PEs through the concept of Service Carving
Time (SCT). The draft updates Section 2.1 of RFC 8584.

Please consider the following comments/questions:

1- Section 2: What happens if time synchronization between PEs becomes
unstable or fails entirely (e.g., if NTP/PTP synchronization breaks down)?
What fallback mechanisms exist if clocks are out of sync?

2- Section 2.2: What about: "Upon receiving a RECV_ES message, the peering
PE's..." --> "Upon receiving a RECV_ES message (indicating a change in the
Ethernet Segment), the peering PE's..."?

3- What about adding an operational section, following RFC 5706?

4- How should the skew value be configured based on network conditions, such as
varying latencies between PEs?

5- Section 5: What constitutes an "unreasonably large" versus a "reasonably
large" SCT? Maybe adding more text on that distinction would prevent
inconsistency in how different vendors handle invalid timestamps.
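
As an illustration of the kind of precision that would help, a receiver-side
check could be as simple as the sketch below. The bound MAX_SCT_AHEAD and the
function name are hypothetical and are not taken from the draft; the point is
only that an explicit threshold would make implementations agree on which
timestamps to reject.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical bound, NOT from the draft: how far in the future a received
# SCT may lie before the receiver treats it as unreasonably large.
MAX_SCT_AHEAD = timedelta(seconds=10)


def sct_is_unreasonably_large(sct: datetime, now: datetime) -> bool:
    """Return True if the received SCT is further ahead of 'now' than the bound."""
    return sct - now > MAX_SCT_AHEAD


now = datetime(2024, 8, 19, 12, 0, 0, tzinfo=timezone.utc)
# An SCT a few seconds ahead passes; one an hour ahead is flagged.
near = sct_is_unreasonably_large(now + timedelta(seconds=3), now)
far = sct_is_unreasonably_large(now + timedelta(hours=1), now)
```

Whatever value the authors consider appropriate, stating it (or stating how it
is derived) in the draft would serve the same purpose.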

6- What are the security aspects of the uni-directional signaling approach?

7- How should asymmetric failures be handled, such as a misconfigured SCT or
a partial PE failure in which certain VLANs or services are impacted while
others are not?

Thanks for this document.

Ines.