Benchmarking Methodology for Link-State IGP Data-Plane Route Convergence
draft-ietf-bmwg-igp-dataplane-conv-meth-23

Note: This ballot was opened for revision 12 and is now closed.

Summary: Needs a YES.

Adrian Farrel

Comment (2010-06-30 for -23)

Section 1

   The test cases in this document are black-box
   tests that emulate the network events that cause convergence, as
   described in [Po09a].

Do the events cause convergence or necessitate convergence?

---

Please also address the large number of minor issues raised in the Routing Area
Directorate review from Julien Meuric as follows...

Hello,

I have been selected as the Routing Directorate reviewer for this draft.
The Routing Directorate seeks to review all routing or routing-related
drafts as they pass through IETF last call and IESG review. The purpose
of the review is to provide assistance to the Routing ADs. For more
information about the Routing Directorate, please see
http://www.ietf.org/iesg/directorate/routing.html

Although these comments are primarily for the use of the Routing ADs, it
would be helpful if you could consider them along with any other IETF
Last Call comments that you receive, and strive to resolve them through
discussion or by updating the draft.

Document: draft-ietf-bmwg-igp-dataplane-conv-meth-21.txt
Reviewer: Julien Meuric (with the help of an anonymous colleague eating
IGPs at breakfast)
Review Date: 06/30/2010
Intended Status: Informational

*Summary:*
I have some minor concerns about this document that I think should be
resolved before publication.

*Comments:*
The document is rather heavy: it covers multiple scenarios, gives
several sequences of testing actions, analyses details about
uncertainty... As a result, for someone not used to the BMWG (please
keep in mind that this is my 1st review on a document from BMWG) it is
not so easy to follow in every detail and it requires some back-up
reading (draft-ietf-bmwg-igp-dataplane-conv-term for instance).

*Major Issues:*
No major issues found.

*Minor Issues:*
---
1/ I imagine it has already been discussed in the WG (sorry if I bring
back a troll), but it seems unusual to use RFC 2119 language for an
Informational document, and that is why it is explicitly stated in
section 2. Considering the status remains the same, instead of
advertising that fact, would it not be simpler to avoid the capital
letters in the corresponding words?
---
2/ My GMPLS background brings me to think that an IGP adjacency may be
independent from the corresponding data link. The document seems to
focus on the classical IGP use, but it would be better to make that
context clearer through a simple sentence than considering it is the
default.
---
3/ There is unfortunately no reference to traffic-engineering
extensions, while it might impact IGP convergence. Adding a few words on
this so as to state it is out of scope (if so) would be welcome.
---
4/ By reading section 3, we understand that the causes considered for
testing in this methodology concern failures and administrative changes
(status, costs). Therefore, the link insertion/recovery is apparently
not part of the testing. However, we can find it in section 8 if we take
a close look at the procedure steps. As a consequence, in order to stay
clearly consistent with draft-ietf-bmwg-igp-dataplane-conv-app-17
referenced here, it would be useful to clarify somewhere in section 3
that interface or link insertion/recovery is treated along with the
failure events and is therefore taken into account.
---
5/ The document would also benefit from stating, in the introduction, the
scope of this methodology regarding the effect of router stress on
convergence performance (i.e. what is addressed in section 5). For
example, add something like:
"Convergence performance is tightly linked to the number of tasks a
router has to deal with. As the most impacting tasks are mainly related
to the control plane and the data plane, the more the DUT is stressed as
in a live environment, the more accurate performance results (i.e. the
ones that would be observed in a live environment) will be. Section 5
gives details on the recommended environment for IGP convergence
performance benchmarking."
---

*Nits:*
---
Even though it may be usual in the WG, the way document references are
built ("AuthID#") is much less readable than "Summarized-Title" as used
in some other places. Let us hope most of them will be updated with RFC
numbers (not more convenient in fact, but a stable reference).
---
The phrase "next-hop router" may be confusing (at least until going into
the details), especially because in some contexts like BGP, a next-hop
router may not be adjacent but remote. How about "adjacent routers" to
reuse IS-IS terminology or "neighbor routers" to reuse OSPF terminology?
---
The "ECMP" acronym is expanded in section 3.4 (where it is actually
tested) while it has been used since section 3.1: expansion should be
moved (or duplicated) there.
---
"Loss of connectivity" and the "LoC" acronym are used
interchangeably: strict consistency throughout the document may not be a
goal, but the association between them should at least be explicit at 1st
use (section 4).
---
"IS-IS" is always referred to as "ISIS", I would add the dash.
---
Some titles on figures (e.g. 9) and equations (e.g. 3) are closer to the
following paragraph than the corresponding item; swapping or reducing
the number of blank lines would make them easier to read.
---
Section 2:
s/in other BMWG work/in other documents issued by the Benchmarking
Methodology Working Group/
---
Section 3.1:
At 1st occurrence, it might be more accurate to specify that "N >= 1" or
"N > 0".
---
Section 3.4:
"the tester emulates N next-hop routers"
Without the figure, it is difficult to quickly picture the
configuration. It may ease the understanding to add something like
"(N-1 adjacent to R1; 1 adjacent to R2)".
---
Section 5.4: "LSA", "LSP" and "SPF" are not expanded: they may be usual,
but IGP is expanded in the abstract and introduction (and "LSP" has 2
usual meanings in the Routing area)... The same question may arise for
"IS-IS" and "OSPF" expansion, but they are considered as "well-known" on
http://www.rfc-editor.org/rfc-style-guide/abbrev.expansion.txt (while
the former are not).
---
Section 5.6:
s/topologies 3, 4, and 6/topologies 3, 4 and 6/
s/packets are transmitted/packets be transmitted/
---
Section 5.9:
s/test case has/test case have/
---
Section 7:
s/loss or not./loss or not?/
s/Complete the table below/The table below should be completed/
---
Section 8:
"DUT's" and "Tester's" read weird to me with respect to what I was
taught at school, but someone put "the car's wheels" on Wikipedia. I
thus leave this issue to native English speakers. :-)
---
Section 8.1.4:
s/may influenced/may be influenced/
---

Pete Resnick

Comment (2011-08-25 for -)

This document seems to be misusing RFC 2119 language. They don't seem to follow
the admonition in section 6 of 2119:

   Imperatives of the type defined in this memo must be used with care
   and sparingly.  In particular, they MUST only be used where it is
   actually required for interoperation or to limit behavior which has
   potential for causing harm (e.g., limiting retransmisssions)  For
   example, they must not be used to try to impose a particular method
   on implementors where the method is not required for
   interoperability.

[Dan Romascanu]

Comment (2007-07-18 for -)

1. The Abstract says 'The methodology can be applied to any link-state IGP,
such as ISIS and OSPF.' Is it true that the methodology applies only to
link-state IGPs? If true, I would suggest that the title is changed to add
'link-state'. Else strike out 'link-state' from the Abstract.

2. Section 3.2.2 - 'To obtain results similar to those that would be
   observed in an operational network, it is recommended that the
   number of installed routes closely approximates that the network.'

Probably '... that of the network'

An indication of the order of magnitude of this number would also seem
appropriate here.

3. Section 4.2 - what does 'remove layer 2 session' mean? I read a layer 2
failure as a failure that is detected at layer 2, but that can reflect a fault
in a lower layer and can be as trivial as a cable failure. Am I wrong?

[David Ward]

Discuss (2007-07-17 for -)

A few comments on this draft:

0) it is unclear why RIP isn't covered

1) it is unclear why the recommended timer values in 3.2.4 do not correspond to
typical values configured in the network but, the route scaling numbers do

2) the packet sampling time in 3.2.5 seems out of date. IGPs converge faster
than the packet sample time today.

3) it is recommended that the results are in usecs only

4) it is unclear if there are packet order test requirements for ECMP paths.
Since the many ECMP tests are called out is there any 'correct' test or outcome
that is desired for selection of ECMP path?

5) On bcast interfaces it is unclear if both p2p/nbma and bcast need to be
configured

6) why don't the tests include generation of LSA/LSP as well as change in data
plane? IOW, what is SUT/DUT in Fig2 as well as 4.1.3?

7) The results from the remote failure case 4.1.3 aren't quite correct:

"The additional
   convergence time contributed by LSP Propagation can be
   obtained by subtracting the Rate-Derived Convergence Time
   measured in 4.1.2 (Convergence Due to Neighbor Interface
   Failure) from the Rate-Derived Convergence Time measured in
   this test case."

Though the point is mostly academic, it isn't technically correct.

8) why don't we include fiber pull test and/or enable disable interface

9) Do we want to specify tests for "link up" vs just "link down?" Link up is a
critical event in the network and frequently causes loops/microloops.

10) In the "metric change" test in 4.5 since traffic is moving from one intf to
another there should be an observable convergence event unlike what is stated
in expected results:

"There should be no externally observable IGP Route Convergence ..."

11) In general the results section of each test should state what should be
observed during the test, packet loss, packets tx/rx between any SUT and
required systems, etc. Right now, there is a very brief description of the
influencing variables of the test. It would not be possible to verify a true
positive on a test w/ current text.

12) There needs to be a notion and testing of specific prefixes:

first, last and then a median and mean

13) There needs to be a notion of important prefixes or those that are biased
for prioritized convergence. E.g. BGP Nexthops.

14) There should be a measure of any microloops formed and duration of any
loops|microloops

15) Is graceful restart and time to restore FIB considered a convergence event?
If not, why?
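
Items 3 and 12 above can be pictured with a minimal sketch. It assumes the
tester records, for each prefix, the time in microseconds until traffic to
that prefix recovers. The function name, the dictionary layout and the sample
values are hypothetical and are not taken from the draft.

```python
from statistics import mean, median


def summarize_convergence(times_us):
    """Summarize per-prefix convergence times, all given in microseconds.

    times_us maps a prefix string to the measured time until traffic to
    that prefix recovered (an assumed measurement, not defined by the draft).
    """
    if not times_us:
        raise ValueError("at least one per-prefix measurement is required")
    ordered = sorted(times_us.values())
    return {
        "first_us": ordered[0],       # first prefix to converge
        "last_us": ordered[-1],       # last prefix to converge
        "median_us": median(ordered),
        "mean_us": mean(ordered),
    }


# Hypothetical measurements, for illustration only.
samples = {
    "10.0.1.0/24": 180_000,
    "10.0.2.0/24": 210_000,
    "10.0.3.0/24": 250_000,
    "10.0.4.0/24": 400_000,
}
summary = summarize_convergence(samples)
print(summary)
```

Reporting all four values, rather than a single convergence time, shows
whether a few slow prefixes dominate the result.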

[Stewart Bryant]

Comment (2011-08-15 for -23)

This is much improved since the previous version, and the RFC Editor's note
addresses my remaining concerns.