Skip to main content

Last Call Review of draft-ietf-lsr-dynamic-flooding-16
review-ietf-lsr-dynamic-flooding-16-genart-lc-enghardt-2024-02-28-00

Request Review of draft-ietf-lsr-dynamic-flooding
Requested revision No specific revision (document currently at 18)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2024-02-29
Requested 2024-02-15
Authors Tony Li , Peter Psenak , Huaimo Chen , Luay Jalil , Srinath Dontula
I-D last updated 2024-02-28
Completed reviews Genart Last Call review of -16 by Reese Enghardt (diff)
Secdir Last Call review of -16 by Chris M. Lonvick (diff)
Rtgdir Last Call review of -13 by Susan Hares (diff)
Assignment Reviewer Reese Enghardt
State Completed
Request Last Call review on draft-ietf-lsr-dynamic-flooding by General Area Review Team (Gen-ART) Assigned
Posted at https://mailarchive.ietf.org/arch/msg/gen-art/lMMEy3R0YhoQBP3SEphbUNw488U
Reviewed revision 16 (document currently at 18)
Result Ready w/nits
Completed 2024-02-28
review-ietf-lsr-dynamic-flooding-16-genart-lc-enghardt-2024-02-28-00
I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair.  Please treat these comments just
like any other last call comments.

For more information, please see the FAQ at

<https://wiki.ietf.org/en/group/gen/GenArtFAQ>.

Document: draft-ietf-lsr-dynamic-flooding-16
Reviewer: Reese Enghardt
Review Date: 2024-02-28
IETF LC End Date: 2024-02-29
IESG Telechat date: Not scheduled for a telechat

Summary: Thank you for this comprehensive and well-reasoned document. This
seems like a compelling solution to a real problem. I have a few suggestions to
further improve clarity and readability.

Major issues: None.

Minor issues:

Introduction:
"However it is very clear that using an Exterior Gateway Protocol as an IGP is
sub-optimal, if only due to the configuration overhead." To me, the "very
clear" and the "if only due" sound like they're contradicting each other. If
it's very clear, I expect a very strong reason or multiple. Please consider
providing more reasons why an EGP is suboptimal or weakening the "very clear".

"The primary issue that is demonstrated when conventional mechanisms are
applied is the poor reaction of the network to topology changes." Please
consider clarifying what conventional mechanisms means specifically here.
Conventional IGPs? Link state protocol specifically? Are the two the same?

"This problem is not entirely surprising. Link state routing protocols were
conceived when links were very expensive and topologies were sparse. The fact
that those same designs are sub-optimal in a dense topology should not come as
a huge surprise. The fundamental premise that original designs addressed was an
environment of extreme cost and scarcity. Technology has progressed to the
point where links are cheap and common. This represents a complete reversal in
the economic fundamentals of network engineering" The text in and around this
part seems a bit redundant, please consider shortening it to say the surprise
part and the "link used to be costly" only once.

Section 3, Solution requirements:
"Changes to nodes outside of the dense subgraph are not acceptable. "
Please consider clarifying what "changes" means here - Is this to say that the
solution needs to be backwards-compatible and/or an extension to an existing
IGP?

Section 4, Dynamic flooding:
"New link state information that arrives from outside of the flooding topology
suggests that the sender has different or no flooding topology information and
that the link state update should be flooded on the flooding topology as well."
This part was not immediately obvious to me - "arrives from outside of the
flooding topology" means it arrives from outside the subgraph that supports
dynamic flooding and/or from a legacy device? Or does it mean that the
information arrives from within the flooding topology but over a link that is
not part of the flooding topology? Please consider clarifying this point.

"When centralized mode is used and if, during a transient, there are multiple
flooding topologies being advertised […]" The word "transient" is used as a
noun multiple times during the draft, so I assume this is a standing term in
routing which I have never heard before. Please consider adding a brief
explanation of what a transient is.

Section 4.1:
Does "legacy flooding" and "standard flooding" refer to the same thing? If so,
please consider unifying the terms here.

Section 4.3:
"If the leader chooses to include a multi-node broadcast LAN segment as part of
the flooding topology, all of the connectivity to that LAN segment should be
included as well." Please consider clarifying what "all the connectivity" means
here, as I think this is the first time LANs are discussed. Does it mean all
links that connect to the LAN segment? Are LANs with multiple links the same as
"multi-access LANs" referenced later in the document, in which case please
consider using a consistent term?

Section 4.4:
Please consider adding figures to help the explanation here.

Section 4.4.2:
"Adding one more leaf between the last and first spine will produce a cycle of
N spines and N leaves." Are these both intended to be N, or is one intended to
be M? Does the algorithm make any assumptions of how many spines and leaves
there are in total?

Section 4.5:
"Instead, we choose to encode the flooding topology as a set of intersecting
paths, where each path is a set of connected edges." I think this is the first
time the document mentions paths. Please consider briefly expanding on how a
path is defined, unless there is a clear consistent definition that is
well-known in the routing area in general. Are edges the same as links, in
which case please consider using consistent terms?

Section 4.6:
"As the flooding topology is defined by edges (not by links)"
This sentence implies these are different things. Please consider clarifying
what the difference is and/or providing a brief reasoning on why it's edges
here and not links.

Section 5.1:
When referencing IS-IS, please consider referencing a relevant RFC or such.
Please consider expanding LSP on first use.

"A TLV to advertise the list of system IDs that compose the flooding topology
for the area." Are system IDs the same as nodes here, in which case please
consider saying so?

Section 5.1.1:
"Due to transients during database flooding, different nodes may not agree on
the Area Leader." Is this a problem? If not, why not? If so, what is the best
way to handle this problem? Please consider discussing these points briefly.

Does each system just pick its own priority? Is it configured? Or how is
priority determined? Please consider discussing these points briefly.

Section 5.1.5:
Please consider expanding IIH PDU on first use.

Section 5.2:
When referencing OSPFv2 and OSPFv3, please consider referencing a relevant RFC
or such.

Section 6.1:
Please consider expanding SPT on first use.
Please consider expanding OL on first use.

Section 6.6.2.1:
Please consider expanding CSNP and PSNP on first use.

"NOTE: If any node connected to the LAN requests the enablement of temporary
flooding, all nodes revert to the standard flooding behavior." Should this
sentence be rephrased as a MUST?

Section 6.7:
"If a node determines that adjacencies are to be added to the flooding
topology, it should add those and begin flooding on those adjacencies
immediately." Does this apply to both centralized and distributed mode? Is this
temporary flooding or does it change the flooding topology? Please consider
clarifying this point.

Section 6.8.1:
Please consider expanding SRM on first use.

Section 6.8.9:
"Temporary flooding MUST be enabled towards a subset of the disconnected nodes
per the discussion in Section 6.8.12." Please consider referencing Section 6.7
as well.

Nits/editorial comments: None.