Benchmarking Methodology for Link-State IGP Data-Plane Route Convergence
Note: This ballot was opened for revision 23 and is now closed.
(David Ward) Discuss
Discuss (2007-07-17 for -)
A few comments on this draft:

0) It is unclear why RIP isn't covered.
1) It is unclear why the recommended timer values in 3.2.4 do not correspond to typical values configured in the network, while the route scaling numbers do.
2) The packet sampling time in 3.2.5 seems out of date. IGPs converge faster than the packet sample time today.
3) It is recommended that the results be reported in usecs only.
4) It is unclear if there are packet order test requirements for ECMP paths. Since many ECMP tests are called out, is there any 'correct' test or outcome that is desired for selection of an ECMP path?
5) On bcast interfaces it is unclear if both p2p/nbma and bcast need to be configured.
6) Why don't the tests include generation of LSA/LSP as well as change in the data plane? IOW, what is the SUT/DUT in Fig. 2 as well as in 4.1.3?
7) The results from the remote failure case 4.1.3 aren't quite correct: "The additional convergence time contributed by LSP Propagation can be obtained by subtracting the Rate-Derived Convergence Time measured in 4.1.2 (Convergence Due to Neighbor Interface Failure) from the Rate-Derived Convergence Time measured in this test case." Though the point is mostly academic, it isn't technically correct.
8) Why don't we include a fiber pull test and/or enable/disable interface?
9) Do we want to specify tests for "link up" vs. just "link down"? Link up is a critical event in the network and frequently causes loops/microloops.
10) In the "metric change" test in 4.5, since traffic is moving from one intf to another, there should be an observable convergence event, unlike what is stated in the expected results: "There should be no externally observable IGP Route Convergence ..."
11) In general, the results section of each test should state what should be observed during the test: packet loss, packets tx/rx between any SUT and required systems, etc. Right now, there is only a very brief description of the influencing variables of the test. It would not be possible to verify a true positive on a test w/ the current text.
12) There needs to be a notion and testing of specific prefixes: first, last, and then a median and mean.
13) There needs to be a notion of important prefixes, or those that are biased for prioritized convergence, e.g. BGP next hops.
14) There should be a measure of any microloops formed and of the duration of any loops or microloops.
15) Is graceful restart and time to restore the FIB considered a convergence event? If not, why?
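As an illustration of comments 7 and 12 above, the following is a minimal sketch and not part of the draft or the ballot. It assumes the tester can export a per-prefix convergence time in microseconds (the unit asked for in comment 3), and it reads "first" and "last" as the earliest and latest prefix to converge. The function names and the sample values are hypothetical.

```python
from statistics import mean, median


def prefix_convergence_stats(times_us):
    """Summarize per-prefix convergence times given in microseconds.

    times_us maps a prefix string to its measured convergence time.
    The mapping and its values are assumed tester output (hypothetical).
    """
    if not times_us:
        raise ValueError("no per-prefix measurements")
    values = list(times_us.values())
    return {
        "first_us": min(values),      # earliest prefix to converge
        "last_us": max(values),       # latest prefix to converge
        "median_us": median(values),
        "mean_us": mean(values),
    }


def lsp_propagation_estimate_us(remote_failure_us, neighbor_failure_us):
    """Difference between the remote-failure (4.1.3) and neighbor-failure
    (4.1.2) convergence times, as the draft text quoted in comment 7 suggests.
    Comment 7 cautions that this is not technically exact, so treat the
    result as an estimate only.
    """
    return remote_failure_us - neighbor_failure_us


# Hypothetical sample data, for illustration only.
sample = {
    "10.0.0.0/24": 180_000,
    "10.0.1.0/24": 210_000,
    "10.0.2.0/24": 450_000,
}
stats = prefix_convergence_stats(sample)
estimate = lsp_propagation_estimate_us(520_000, 450_000)
```

Reporting the four statistics separately keeps a slow tail of prefixes visible instead of folding it into a single convergence figure.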
(Ron Bonica) Yes
(Jari Arkko) No Objection
(Stewart Bryant) (was Discuss) No Objection
This is much improved since the previous version, and the RFC Editor's note addresses my remaining concerns.
(Ross Callon) No Objection
(Gonzalo Camarillo) No Objection
(Ralph Droms) No Objection
(Lisa Dusseault) No Objection
(Lars Eggert) No Objection
(Adrian Farrel) (was Discuss) No Objection
Section 1

"The test cases in this document are black-box tests that emulate the network events that cause convergence, as described in [Po09a]."

Do the events cause convergence or necessitate convergence?

---

Please also address the large number of minor issues raised in the Routing Area Directorate review from Julien Meuric as follows...

Hello,

I have been selected as the Routing Directorate reviewer for this draft. The Routing Directorate seeks to review all routing or routing-related drafts as they pass through IETF last call and IESG review. The purpose of the review is to provide assistance to the Routing ADs. For more information about the Routing Directorate, please see http://www.ietf.org/iesg/directorate/routing.html

Although these comments are primarily for the use of the Routing ADs, it would be helpful if you could consider them along with any other IETF Last Call comments that you receive, and strive to resolve them through discussion or by updating the draft.

Document: draft-ietf-bmwg-igp-dataplane-conv-meth-21.txt
Reviewer: Julien Meuric (with the help of an anonymous colleague eating IGPs at breakfast)
Review Date: 06/30/2010
Intended Status: Informational

*Summary:* I have some minor concerns about this document that I think should be resolved before publication.

*Comments:* The document is rather heavy: it covers multiple scenarios, gives several sequences of testing actions, analyses details about uncertainty... As a result, for someone not used to the BMWG (please keep in mind that this is my 1st review of a document from BMWG), it is not so easy to follow in every detail and it requires some back-up reading (draft-ietf-bmwg-igp-dataplane-conv-term for instance).

*Major Issues:* No major issues found.

*Minor Issues:*

---
1/ I imagine it has already been discussed in the WG (sorry if I bring back a troll), but it seems unusual to use RFC 2119 language for an Informational document, and that is why it is explicitly stated in section 2. Considering the status remains the same, instead of advertising that fact, would it not be simpler to avoid the capital letters in the corresponding words?

---
2/ My GMPLS background brings me to think that an IGP adjacency may be independent from the corresponding data link. The document seems to focus on the classical IGP use, but it would be better to make that context clearer through a simple sentence than to treat it as the default.

---
3/ There is unfortunately no reference to traffic-engineering extensions, although they might impact IGP convergence. Adding a few words on this so as to state it is out of scope (if so) would be welcome.

---
4/ By reading section 3, we understand that the causes considered for testing in this methodology concern failures and administrative changes (status, costs). Therefore, link insertion/recovery is apparently not part of the testing. However, we can find it in section 8 if we take a close look at the procedure steps. As a consequence, in order to stay clearly consistent with draft-ietf-bmwg-igp-dataplane-conv-app-17 referenced here, it would be useful to clarify somewhere in section 3 that interface or link insertion/recovery is treated along with the failure events and is therefore taken into account.

---
5/ The document would also gain from stating, in the introduction, the scope of this methodology regarding router stress with respect to convergence performance (i.e. what is addressed in section 5). For example, add something like:

"Convergence performance is tightly linked to the number of tasks a router has to deal with. As the most impacting tasks are mainly related to the control plane and the data plane, the more the DUT is stressed as in a live environment, the more accurate performance results (i.e. the ones that would be observed in a live environment) will be. Section 5 gives details on the recommended environment for IGP convergence performance benchmarking."

---
*Nits:*

---
Even though it may be usual in the WG, the way document references are built ("AuthID#") is much less readable than "Summarized-Title" as used in some other places. Let us hope most of them will be updated with RFC numbers (not more convenient in fact, but a stable reference).

---
The phrase "next-hop router" may be confusing (at least until going into the details), especially because in some contexts like BGP, a next-hop router may not be adjacent but remote. How about "adjacent routers" to reuse IS-IS terminology or "neighbor routers" to reuse OSPF terminology?

---
The "ECMP" acronym is expanded in section 3.4 (where it is actually tested) while it has been used since section 3.1: the expansion should be moved (or duplicated) there.

---
"Loss of connectivity" and the "LoC" acronym are used alternately: strict consistency along the document may not be a goal, but the association between them should at least be explicit at 1st use (section 4).

---
"IS-IS" is always referred to as "ISIS"; I would add the dash.

---
Some titles on figures (e.g. 9) and equations (e.g. 3) are closer to the following paragraph than to the corresponding item; swapping or reducing the amount of blank lines would make them easier to read.

---
Section 2:
s/in other BMWG work/in other documents issued by the Benchmarking Methodology Working Group/

---
Section 3.1:
At 1st occurrence, it might be more accurate to specify that "N >= 1" or "N > 0".

---
Section 3.4:
"the tester emulates N next-hop routers"
Without the figure, it is difficult to quickly picture the configuration. It may ease the understanding to add something like "(N-1 adjacent to R1; 1 adjacent to R2)".

---
Section 5.4:
"LSA", "LSP" and "SPF" are not expanded: they may be usual, but IGP is expanded in the abstract and introduction (and "LSP" has 2 usual meanings in the Routing area)... The same question may arise for "IS-IS" and "OSPF" expansion, but they are considered as "well-known" on http://www.rfc-editor.org/rfc-style-guide/abbrev.expansion.txt (while the former are not).

---
Section 5.6:
s/topologies 3, 4, and 6/topologies 3, 4 and 6/
s/packets are transmitted/packets be transmitted/

---
Section 5.9:
s/test case has/test case have/

---
Section 7:
s/loss or not./loss or not?/
s/Complete the table below/The table below should be completed/

---
Section 8:
"DUT's" and "Tester's" read weird to me with respect to what I was taught at school, but someone put "the car's wheels" on Wikipedia. I thus leave this issue to native English speakers. :-)

---
Section 8.1.4:
s/may influenced/may be influenced/

---
(Stephen Farrell) No Objection
(Sam Hartman) No Objection
(Russ Housley) No Objection
(Cullen Jennings) No Objection
(Chris Newman) No Objection
(Tim Polk) No Objection
(Pete Resnick) No Objection
This document seems to be misusing RFC 2119 language. They don't seem to follow the admonition in section 6 of 2119:

"Imperatives of the type defined in this memo must be used with care and sparingly. In particular, they MUST only be used where it is actually required for interoperation or to limit behavior which has potential for causing harm (e.g., limiting retransmisssions) For example, they must not be used to try to impose a particular method on implementors where the method is not required for interoperability."
(Dan Romascanu) No Objection
Comment (2007-07-18 for -)
1. The Abstract says 'The methodology can be applied to any link-state IGP, such as ISIS and OSPF.' Is it true that the methodology applies only to link-state IGPs? If so, I would suggest that the title be changed to add 'link-state'. Otherwise, strike 'link-state' from the Abstract.
2. Section 3.2.2 - 'To obtain results similar to those that would be observed in an operational network, it is recommended that the number of installed routes closely approximates that the network.' This should probably read '... that of the network'. An indication of the order of magnitude of this number also seems to be in order here.
3. Section 4.2 - what does 'remove layer 2 session' mean? I read a layer 2 failure as a failure that is detected at layer 2 but that can reflect a fault in a lower layer, and can be as trivial as a cable failure. Am I wrong?