Benchmarking Terminology for Protection Performance
Note: This ballot was opened for revision 08 and is now closed.
( Ron Bonica ) Yes
( Dan Romascanu ) Yes
Comment (2010-06-17 for -)
It looks to me that 'Benchmarking Terminology for Performance of sub-IP layer Protection Mechanisms' would be a more appropriate name for this document.
Jari Arkko No Objection
( Stewart Bryant ) No Objection
Comment (2010-06-16 for -)
I am not sure that RFC2119 language is appropriate here:

   Figures 1 through 5 show models that MAY be used when benchmarking Sub-IP Protection mechanisms, which MUST use a Protection Switching System that consists of a minimum of two Protection-Switching Nodes, an Ingress Node known as the Headend Node and an Egress Node known as the Merge Node.

Ideally the RFC2119 definition should appear before first use in the document:

   The Protection Switching System MUST include either a Primary Path and Backup Path, as shown in Figures 1 through 4, or a Primary Node and Standby Node, as shown in Figure 5.

I am not a mathematician, can an equation have a plurality of sub-equations, or is a different mathematical construct needed?

   TBLM as shown in Equation 2:

   (Equation 2)
      (Equation 2a)
         TBLM Failover Time = Time(Failover) - Time(Failover Event)
      (Equation 2b)
         TBLM Reversion Time = Time(Reversion) - Time(Restoration)

Same with Eq3
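As an illustration of the typesetting question only, one common convention numbers related equations as lettered parts of a single equation. The sketch below assumes LaTeX with the amsmath package, which the draft itself does not use; the terms are copied from the quoted Equation 2, and the lettering (2a), (2b) would follow only if the equation counter stands at 2.

```latex
\begin{subequations}
\begin{align}
\text{TBLM Failover Time}  &= \mathrm{Time}(\text{Failover}) - \mathrm{Time}(\text{Failover Event}) \\
\text{TBLM Reversion Time} &= \mathrm{Time}(\text{Reversion}) - \mathrm{Time}(\text{Restoration})
\end{align}
\end{subequations}
```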
( Gonzalo Camarillo ) No Objection
( Lars Eggert ) (was Discuss) No Objection
Comment (2010-06-14 for -09)
> Benchmarking Terminology
> for Protection Performance

As stated in the abstract, Section 3.5 actually defines some tests. The title should reflect that this document is not only defining terminology.

Section 3.1., paragraph 4:
> b. Ri is a node which forwards data frames to R[i+1] over Link
> Li[i+1] for all i, 1<i<n, based on information in the sub-IP
> layer.

Since R1/L12 are defined in (a) and (c) defines Rn, you probably want to change 1<i<n to 1<i<n-1, so that Rn isn't defined twice. (And then you need to define L(n-1)n in (c)).

(I also wonder why you define these sets of links and nodes when they aren't used anymore in the remainder of the document.)
( Adrian Farrel ) No Objection
Comment (2010-06-16 for -)
I have a huge raft of issues and concerns about this document. In the end I raised a COMMENT not a DISCUSS because I don't believe that the publication of this document as an Informational RFC without making the changes will be harmful. However, attending to these comments would, I believe, significantly improve the value of your work. The responsible AD may decide that the comments need to be addressed.

---

idnits (http://tools.ietf.org/tools/idnits/) throws up a number of issues. Although many of these are minor or editorial, it would have made the document easier to read had you fixed them. I think that some of the boilerplate issues are more significant and that the I-D cannot be accepted unless a new revision with the correct boilerplate is submitted.

The document writeup says:

   (1.g) Has the Document Shepherd personally verified that the document satisfies all ID nits? (See http://www.ietf.org/ID-Checklist.html and http://tools.ietf.org/tools/idnits/). Boilerplate checks are not enough; this check needs to be thorough. Has the document met all formal review criteria it needs to, such as the MIB Doctor, media type and URI type reviews?

   There are a few errors in the nits: http://tools.ietf.org/idnits?url=http://tools.ietf.org/id/draft-ietf-bmwg-protection-term-08.txt These can be corrected after AD-review, but there will be some work necessary to adopt the new boilerplate, update references, fix page lengths, etc.

Clearly this did not happen, and I really think we need to push back on document shepherds to understand that the only acceptable answer to the question is "Yes, idnits passes cleanly".

---

There is a significant difference between the document title ("Benchmarking Terminology for Protection Performance") and the content of the document as explained in the Abstract and Introduction. Viz.

   This document provides common terminology and metrics for benchmarking the performance of sub-IP layer protection mechanisms.

1. The title should mention metrics
2. The title should indicate the scope is sub-IP

---

The Introduction starts off with...

   Technologies that function at sub-IP layers can be enabled to provide further protection of IP traffic by providing the failure recovery at the sub-IP layers so that the outage is not observed at the IP-layer.

This seems a reasonable statement, but isn't it in contradiction with the whole premise of this document as stated in the Abstract...

   The performance benchmarks are measured at the IP-Layer, avoiding dependence on specific sub-IP protection mechanisms.

In other words, if the outage is not observed at the IP-layer, how will you measure the benchmarks at the IP-Layer?

---

I would really have liked this work to be aligned with RFC 4427. That document sets out terminology for protection and restoration in GMPLS networks (i.e. sub-IP) and attempts to achieve alignment with ITU-T Recommendations G.808.1 and G.841. It is important to use identical rather than similar terms for protection and restoration techniques across the IETF when we are referring to sub-IP layers.

Actually, I find the terminology rather woolly. For example, in Section 1

   The Working Path is the Primary Path prior to the Failover Event and the Backup Path after the Failover Event.

Yet in Section 3.1.3

   The Primary Path is the Path that traffic traverses prior to a Failover Event.

And it is difficult to read a document where the terms are used well in advance of their definitions. It might help if Section 3 was split with sections 3.1 through 3.4 presented as terminology, and sections 3.5 onwards as Test Considerations.

The definition of "Restoration" in section 3.3.5 is unlike anything I have seen before and is very much at odds with the way the term is used in sub-IP networks.

---

Figure 4

There is a spurious arrow head on the left-hand end of the arrow marked "IP-Layer Forwarding"

---

Section 3.1.1

   a. R1 is the ingress node and forwards IP packets, which input into DUT/SUT, to R2 as sub-IP frames over link L12.
   b. Ri is a node which forwards data frames to R[i+1] over Link Li[i+1] for all i, 1<i<n, based on information in the sub-IP layer.

I don't think you should assume anything about the encapsulation method or transport mechanisms in the sub-IP technology. What is a sub-IP frame in packet over WDM? In many technologies, node Ri does not forward sub-IP frames. It forwards a signal.

---

Section 3.1.1

   "A bidirectional path", which transmits traffic in both directions along the same nodes, consists of two unidirectional paths. Therefore, the two unidirectional paths belonging to "one bidirectional path" will be treated independently when benchmarking for "a bidirectional path".

Doesn't that mean that there will be different observed performance in the IP layer according to whether bidirectional or unidirectional protection is enabled in the sub-IP layer? But you are saying that you will not distinguish the two cases and so you will report benchmarks incorrectly.

---

Section 3.1.3

   Definition: The preferred path for forwarding traffic between two or more nodes.

This is (possibly) the only mention of p2mp sub-IP paths. Is this really in scope for your draft?

---

Section 3.1.5

   The Backup Path MUST be created prior to the Failover Event.

This seems to rule out the use of restoration (in the standard transport network sense of the word) in the sub-IP network. Why do you rule this out? In fact, section 3.1.7 seems to contradict this statement.

---

Section 3.1.5

Is the list intended to be exhaustive? This seems to have forgotten 1:1 protection.

---

Section 3.1.5

   The backup path generally originates at the point of failure, and terminates at a node along a primary path.

I don't find the term "point of failure" in the rest of the document, but it seems to me this is wrong. If the backup path starts at the point of failure (for example, a failed node), what use will it be?

---

Section 3.1.8

   3.1.8. Disjoint Paths

   Definition: A pair of paths that do not share a common link.
   Discussion: Two paths are disjoint if they do not share a common node other than the ingress and egress.

While these two paragraphs are compatible it is not clear that the Discussion is simply observing that node-disjoint paths are necessarily link-disjoint. Usually, one talks about link-disjoint paths and node-disjoint paths.

---

Section 3.1.9

You use the term "penultimate egress node" which may be a bit confusing. After all, there is only one egress node.

---

Section 3.1.10

   Definition: SRLG is a set of links which share a physical resource.

This is a debatable definition! What is a physical resource? Does it include a power supply? What about separate ducts in the same bridge? Or transit of the same politically unstable country? The last line of the Discussion seems to be the really key bit.

   Discussion: SRLG is considered the set of links to be avoided when the primary and secondary paths are considered disjoint. The SRLG will fail as a group if the shared resource fails.

---

Section 3.2

I am not really comfortable with the split of "link protection" and "local link protection", but I see how it may be interesting for benchmarking. I don't understand why there isn't a concept of "local node protection" where the failure of node C on the path ABCDE is protected by a direct link BD.

---

Section 3.2.3

   Path Protection provides Node Protection and Link Protection for every node and link along the Primary Path. A Backup Path providing Path Protection MUST have the same ingress node as the Primary Path.

I don't see why you rule out a backup path that is not fully link and node diverse

---

Section 3.2.7

   The State Control Interface MAY be used for Redundant Node Protection. The State Control Interface MUST be out-of-band. It is possible to have Redundant Node Protection in which there is no state control or state control is provided in-band.

This is hard to parse. It appears to say that state control must be provided out of band. And yet it also says it is possible for it to be in band.

---

Section 3.5

   The following Benchmarks MAY be assessed on a per-flow basis using at least 16 flows spread over the routing table (more flows is better).

Isn't this SHOULD?
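
The distinction raised under Section 3.1.8 above, between link-disjoint and node-disjoint paths, can be made concrete with a small check. This is a minimal Python sketch, not anything taken from the draft: the function names are hypothetical, a path is assumed to be an ordered list of node names, links are assumed to be undirected, and both paths are assumed to share the same ingress and egress.

```python
from typing import List, Set, FrozenSet


def links(path: List[str]) -> Set[FrozenSet[str]]:
    """Return the set of undirected links between consecutive nodes (assumption)."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def link_disjoint(p: List[str], q: List[str]) -> bool:
    """True if the two paths share no common link."""
    return links(p).isdisjoint(links(q))


def node_disjoint(p: List[str], q: List[str]) -> bool:
    """True if the paths share no node other than the ingress and egress."""
    return set(p[1:-1]).isdisjoint(q[1:-1])


# Hypothetical topologies, all running from ingress A to egress E.
primary = ["A", "B", "C", "D", "E"]
backup_via_c = ["A", "F", "C", "G", "E"]  # reuses node C, but none of the links
backup_clean = ["A", "F", "G", "E"]       # shares neither links nor interior nodes
```

With these inputs, `backup_via_c` satisfies the quoted Definition (no common link) but not the quoted Discussion (it shares node C), while `backup_clean` satisfies both.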
( Russ Housley ) No Objection
( Tim Polk ) No Objection
Comment (2010-06-15 for -)
I suspect this is clear enough for its intended audience, but a more detailed description of the calculations in 3.6.1 and 3.6.3 would help novice readers. I couldn't quite sort out the equations. For example in 3.6.3 (TBM), are the following interpretations correct?

(1) Time(Failover) = Timestamp on first unimpaired packet received at egress node after the backup path became the working path
(2) Time(Failover Event) = Timestamp on the last unimpaired packet received at egress node on the primary path before failure

I was unable to construct similar statements for the TBLM in 3.6.1 at all.

Nits: in section 3.1.1, there is a confusing mixture of notation, using parentheses and brackets interchangeably. The definition of path uses parentheses, as in "L(n-1)n", but the description of node in subitem b. uses brackets, as in "Li[i+1]".

In 3.6.3 under discussion, the comma following "observation of unimpaired packets" should be a period.
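
If the two interpretations proposed above are right, which this ballot does not confirm, the Failover Time would reduce to a difference of two egress timestamps. The Python sketch below works under that reading only. It further assumes that the TBM calculation has the same difference form as the TBLM equation quoted earlier in this ballot, that the trace contains only unimpaired packets and covers a single failover with no reversion, and that each packet can be attributed to a path. The function name, the tuple layout and the "primary"/"backup" labels are all hypothetical.

```python
from typing import Iterable, Tuple


def failover_time(packets: Iterable[Tuple[int, str]]) -> int:
    """Failover Time under the reviewer's tentative reading (an assumption).

    packets: (egress_timestamp_ms, path) for each unimpaired packet,
             where path is "primary" or "backup" (hypothetical labels).
    """
    packets = list(packets)

    # Time(Failover Event): last unimpaired packet seen on the primary path.
    primary_times = [t for t, path in packets if path == "primary"]
    if not primary_times:
        raise ValueError("no unimpaired packets on the primary path")
    t_failover_event = max(primary_times)

    # Time(Failover): first unimpaired packet on the backup path after that.
    backup_times = [t for t, path in packets
                    if path == "backup" and t > t_failover_event]
    if not backup_times:
        raise ValueError("no unimpaired packets on the backup path")
    t_failover = min(backup_times)

    return t_failover - t_failover_event


# Hypothetical trace: timestamps in milliseconds at the egress node.
trace = [(0, "primary"), (10, "primary"), (60, "backup"), (70, "backup")]
```

For the trace shown, the last primary-path packet arrives at 10 ms and the first backup-path packet at 60 ms, so `failover_time(trace)` returns 50.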