Summary: Has a DISCUSS. Has enough positions to pass once DISCUSS positions are resolved.
A) Section 8. Method of Measurement

I think the metrics are fine; what worries me is the measurement method. My concerns with it are the following.

1. The application of this measurement method is not clearly scoped. I will therefore assume that measurements across the Internet are possible. In that context, I think the definition of, and protection against, severe congestion has significant shortcomings. The main reason is that during a configurable time period (default 1 s) the sender will attempt to send at a rate specified by a table, independently of what happens during that second.

2. The algorithm for adjusting the rate is table driven, but the document gives no guidance on how to construct the table or on limits to how much the values may change between table entries. In addition, the algorithm discusses larger steps in the table without any reflection on what these step sizes may represent in offered load.

3. The algorithm's reaction to any sequence number gaps depends on delay and how it relates to unspecified delay thresholds. There is also no text discussing how these thresholds should be configured for safe operation.

B) Section 8. Method of Measurement

There is no specification here of the measurement protocol that provides sequence numbers, the feedback channel, and the control channel. Is this intended to use TWAMP? From my perspective this document defines the metrics at standards track level. However, the method for actually running the measurements is not specified at standards track level; no one can build an implementation from it. And if the section is intended to provide requirements on a protocol that performs these measurements, I think several aspects are missing.

There appear to be several ways forward to resolve this: one is to split out the method of measurement and define it separately at standards track level using a particular protocol; another is to write it purely as requirements on a measurement protocol.
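To illustrate concern A.2, here is a minimal sketch of why a step measured in table rows says little about the change in offered load. The table layout (1 Mbps steps up to 1 Gbps, then 100 Mbps steps up to 10 Gbps) and both function names are assumptions of mine for illustration; the document under review does not specify them.

```python
def build_rate_table(fine_step_mbps=1, fine_limit_mbps=1000,
                     coarse_step_mbps=100, max_mbps=10_000):
    """Return offered-load rates in Mbps for a hypothetical two-region table."""
    fine = list(range(fine_step_mbps, fine_limit_mbps + 1, fine_step_mbps))
    coarse = list(range(fine_limit_mbps + coarse_step_mbps,
                        max_mbps + 1, coarse_step_mbps))
    return fine + coarse


def load_change_mbps(table, row, rows_to_step):
    """Change in offered load when moving up rows_to_step rows from row (0-based)."""
    target = min(row + rows_to_step, len(table) - 1)  # stay inside the table
    return table[target] - table[row]


table = build_rate_table()
low_region_jump = load_change_mbps(table, 0, 10)     # 10-row step from the first row
high_region_jump = load_change_mbps(table, 999, 10)  # 10-row step from the 1 Gbps row
print(low_region_jump, high_region_jump)
```

With this assumed table, the same 10-row step adds 10 Mbps near the bottom of the table and 1000 Mbps just above the 1 Gbps row, which is why I would like the text to bound steps in terms of offered load rather than rows.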
Thank you to Rifaat Shekh-Yusef for the SECDIR review.
Section 3

   5. Less emphasis on ISP gateway measurements, possibly due to less traffic crossing ISP gateways in future.

nit: sentence fragment.

Section 5

   This section sets requirements for the following components to support the Maximum IP-layer Capacity Metric.

editorial/nit: I don't think I understand what "the following components" are. Is this referencing some preexisting template for a metric definition? (Same for §6 and §7.)

Section 5.3

   The number of these IP-layer bits is designated n0[dtn,dtn+1] for a specific dt.

(editorial) I'm a little confused by this notation for n0. In Section 4 we say that "dtn" references a specific sub-interval, but the brackets and two items look like they should be the start and end of an interval, for which perhaps just the 'n' would be more appropriate (but we'd want a half-open interval, and would need to worry about 0- vs 1-indexing for n, etc.).

   Anticipating a Sample of Singletons, the interval dt SHOULD be set to a natural number m so that T+I = T + m*dt with dtn+1 - dtn = dt and with 1 <= n <= m.

nit: "dt [...] set to m" doesn't make any sense.
nit: this looks like more of a "for 1 <= n <= m" than an "and with".

   Mathematically, this definition can be represented as:

                    ( n0[dtn,dtn+1] )
      C(T,dt,PM) = -------------------------
                              dt

(nit) the "n" appears (in 'dtn') only on the right side but there is no operation applied over it, so this "equation" seems unbalanced without also specifying 'n'. I think the introductory text should mention something about "for each n" or "for a given interval dtn" or similar.

   o n0 is the total number of IP-layer header and payload bits that can be transmitted in Standard Formed packets [RFC8468] from the

(nit) RFC 8468 spells it "standard-formed".

   o C(T,dt,PM) the IP-Layer Capacity, corresponds to the value of n0 measured in any sub-interval ending at dtn (meaning T + n*dt), divided by the length of sub-interval, dt.

If it's supposed to be "ending at dtn", then why is the "dtn+1" in the picture at all?
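One way to make the indexing explicit, offered only as a sketch: write t_n = T + n*dt for the boundary times (t_n and C_n are my notation, not the draft's) and label each half-open sub-interval by the n at which it ends.

```latex
t_n = T + n\,dt, \qquad
C_n(T,dt,PM) = \frac{n_0\big[\,t_{n-1},\; t_n\big)}{dt}
\quad \text{for } 1 \le n \le m, \qquad m\,dt = I
```

Under this reading the m sub-intervals cover [T, T+I) exactly once, each one ends at T + n*dt as the bullet above says, and no boundary beyond t_m = T+I is needed.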
   o PM represents other performance metrics [see section 5.4 below]; their measurement results SHALL be collected during measurement of IP-layer Capacity and associated with the corresponding dtn for further evaluation and reporting.

(nit) this seems to duplicate (be a subset of) the paragraph before "Mathematically, this definition can be represented".

   o The bit rate of the physical interface of the measurement device must be higher than that of the link whose C(T,I,PM) is to be measured.

(nit) I thought we were measuring a path, not a link.

Section 5.4

   RTD[dtn-1,dtn] is defined as a sample of the [RFC2681] Round-trip

Do we really want to be using n0[dtn,dtn+1] but RTD[dtn-1,dtn]? Can we pick a consistent sub-interval notation?

Section 6.3

   Define the Maximum IP-layer capacity, Maximum_C(T,I,PM), to be the maximum number of IP-layer bits n0[dtn,dtn+1] that can be transmitted in packets from the Src host and correctly received by the Dst host,

(nit) the relevant formulae include a dt divisor, but I don't see anything in this prose that would correspond to such a divisor.

   The interval dt SHOULD be set to a natural number m so that T+I = T + m*dt with dtn+1 - dtn = dt and with 1 <= n <= m.

nit: "dt [...] set to m" doesn't make any sense.
nit: this looks like more of a "for 1 <= n <= m" than an "and with".

   Mathematically, this definition can be represented as:

                            max  ( n0[dtn,dtn+1] )
                           [T,T+I]
      Maximum_C(T,I,PM) = -------------------------
                                     dt
      where:
       T                                      T+I
       _________________________________________
       |   |   |   |   |   |   |   |   |   |   |
   dtn=1   2   3   4   5   6   7   8   9  10  n+1
                                              n=m

(nit) as mentioned previously, the definition of "dtn" lists it as being the sub-interval, not the boundary point/time of a sub-interval.

Section 6.5

   If traffic conditioning (e.g., shaping, policing) applies along a path for which Maximum_C(T,I,PM) is to be determined, different values for dt SHOULD be picked and measurements be executed during multiple intervals [T, T+I].
   A single constant interval dt SHOULD be chosen so that is an integer multiple of increasing values k times serialisation delay of a path MTU at the physical interface speed where traffic conditioning is expected. [...]

nit: "so that is an integer multiple" seems to be missing a word.

Also, this seems to say that different values for dt SHOULD be picked, but also that a constant dt SHOULD be chosen. How can those both be recommended and be consistent with each other? (I mean, I assume that the intent is to do multiple runs with different (fixed) dt, but that's not what the text seems to say.)

Section 7.3

   Define the IP-layer Sender Bit Rate, B(S,st), to be the number of IP-layer bits (including header and data fields) that are transmitted from the Source during one contiguous sub-interval, st, during the test interval S (where S SHALL be longer than I), and where the fixed-size packet count during that single sub-interval st also provides the number of IP-layer bits in any interval: n0[stn-1,stn].

(1) there doesn't seem to be any restriction that the observed packets list Dst as the destination address, so formally it seems this would count *all* traffic generated by Sender, not just the traffic relevant for the (path capacity) test.

(2) It seems a little unfortunate that we reuse the 'n0' symbol here for a different meaning than in the earlier capacity metrics.

   Measurements according to these definitions SHALL use the UDP transport layer. Any feedback from Dst host to Src host received by Src host during an interval [stn-1,stn] MUST NOT result in an adaptation of the Src host traffic conditioning during this interval (rate adjustment occurs on st boundaries).

Hmm, this "MUST NOT" is interesting, as it seems to imply extremely tight coordination between the measurement point for this metric and the Source itself.
(Note that the toplevel §7 admits the possibility that measurement will occur at a location other than the Src host to network path interface, via "(or as close as practical)".)

Section 8.1

   At the beginning of a test, the sender begins sending at rate R1 and the receiver starts a feedback timer at interval F (while awaiting

It's a little hard to search for, but I didn't find any previous mention of 'F' or it being defined as a parameter or term. Should it be a listed parameter somewhere?

   If the feedback indicates that sequence number anomalies were detected OR the delay range was above the upper threshold, the offered load rate is decreased. Also, if congestion is now confirmed by the current feedback message being processed, then the offered load rate is decreased by more than one rate (e.g., Rx-30). [...]

Does "congestion is now confirmed" mean that "congestion confirmed" is like a one-way latch and this transition only occurs at most once over the course of a test? Or could the Rx-30 happen multiple times? (The pseudocode indicates the former.)

   If the feedback indicates that there were no sequence number anomalies AND the delay range was above the lower threshold, but below the upper threshold, the offered load rate is not changed.

The way this is written suggests that there will always be a lower and an upper threshold for delay, but the rest of the document so far didn't give me that impression. E.g., we talk about PM only as "at least one fundamental metric and target performance threshold MUST be supplied", and to me having both upper and lower thresholds would be two thresholds, not one.

Section 8.2

   Here, as with any Active Capacity test, the test duration must be kept short. 10 second tests for each direction of transmission are common today. The default measurement interval specified here is I = 10 seconds). In combination with a fast search method and user-network coordination, the concerns raised in RFC 6815[RFC6815] are alleviated. [...]
I skimmed RFC 6815 and had a bit of a hard time making the connection for why combining a 10-second interval, fast search method, and user-network coordination alleviate the concerns of RFC 6815. There doesn't seem to be much in 6815 itself about how testing in production can be done safely, so my current working assumption is that the conclusion presented here reflects the results of "new work" being recorded for the first time (in the RFC series) in this document. If that assumption is correct, I'd suggest spending some more words to support the conclusion, e.g., making analogies to other "normal" traffic patterns and how the benchmarking setup is not qualitatively different from them.

Section 8.3

   As testing continues, implementers should expect some evolution in the methods. The ITU-T has published a Supplement (60) to the Y-series of Recommendations, "Interpreting ITU-T Y.1540 maximum IP-layer capacity measurements", [Y.Sup60], which is the result of continued testing with the metric and method described here.

I pulled up the [Y.Sup60] reference, and it does not seem to reference this draft by name. On what basis do we conclude that it "is the result of continued testing with the metric and method described here"? Skimming/searching, I do see many similar formulae and methods presented, but how do we conclude they are precisely the same?

Section 10

Should we say something about making sure that I is reasonably bounded? IIRC we say so elsewhere in the text but not exactly here.

   2. A REQUIRED user client-initiated setup handshake between cooperating hosts and allows firewalls to control inbound unsolicited UDP which either go to a control port [expected and w/authentication] or to ephemeral ports that are only created as needed. [...]

nit: the grammar is odd in the first part of this sentence; the part before the "and" doesn't seem like it can join up with anything after the "and".
Is the intent something like "It is REQUIRED to have a user client-initiated setup handshake between cooperating hosts that allows firewalls to [...]"?

   3. Integrity protection for feedback messages conveying measurements is RECOMMENDED.

(In some sense you want authentication as well as integrity protection.)

   5. Senders MUST be rate-limited. This can be accomplished using the pre-built table defining all the offered load rates that will be supported (Section 8.1). The recommended load-control search algorithm results in "ramp up" from the lowest rate in the table.

nit: since (effectively) each implementation will have their own pre-built table, I think it should be "using a pre-built table".

Appendix 13

If we start at Rx (row) 1, is it going to cause problems when we drop down to Rx = 0 in the loss/congestion cases?

The mechanism in the pseudocode to stop taking large increments in sending rate above the "hSpeedThresh" does not seem to be described in the prose in §8.1. (That said, it seems like a good idea, given the likely table composition.)

(Also, indenting one tab for the outer conditionals and two more for the inner ones looks a bit unusual.)

Section 14

It's not entirely clear to me why RFC 2330 is classified as normative but RFC 7312 is informative, just based on the locations where they are referenced.
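To make the Section 8.1 and pseudocode questions above concrete, here is a minimal sketch of how I read the prose. It is not the draft's pseudocode. The names SearchState and next_row, the threshold values, the count needed to confirm congestion, the single-row increase, and the table bounds are all assumptions of mine; the latch and the clamp at the lowest row are the two behaviors I am asking about.

```python
from dataclasses import dataclass


@dataclass
class SearchState:
    row: int = 1             # current table row Rx; the test starts at R1
    impaired_count: int = 0  # feedback messages that reported impairment so far
    confirmed: bool = False  # "congestion confirmed", modeled as a one-way latch


def next_row(state, seq_anomalies, delay_range_ms,
             lower_ms=30.0, upper_ms=90.0,  # hypothetical delay thresholds
             confirm_after=3, big_step=30,  # hypothetical latch count, Rx-30 step
             min_row=1, max_row=2000):      # hypothetical table bounds
    """Apply one feedback message to state and return the new row."""
    if seq_anomalies or delay_range_ms > upper_ms:
        state.impaired_count += 1
        step = 1
        if not state.confirmed and state.impaired_count >= confirm_after:
            state.confirmed = True  # latch: the large decrease happens only once
            step = big_step
        # Clamp so a decrease near the bottom of the table cannot yield row 0.
        state.row = max(min_row, state.row - step)
    elif delay_range_ms < lower_ms:
        # Assumption: no impairment and a low delay range means "go up one row".
        state.row = min(max_row, state.row + 1)
    # Otherwise the delay range lies between the thresholds: row is unchanged.
    return state.row


state = SearchState(row=50)
rows = [next_row(state, anomalies, delay)
        for anomalies, delay in [(False, 10.0), (False, 50.0), (True, 10.0),
                                 (False, 100.0), (True, 0.0), (True, 0.0)]]
print(rows)
```

If the intended behavior differs from this (for example, if the large decrease can repeat, or if row 0 is a valid "stop sending" row), it would help to say so explicitly in the prose.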
[[ comments ]]

[ section 4 ]

* RFC 6438 isn't only about flow label treatment at Tunnel End Points, since other devices in the network can do ECMP where the flow label is part of the flow. Maybe just a full stop after "when routers have complied with RFC6438 guidelines."?

[[ questions ]]

[ sections 5.6/6.6/7.4 ]

* Should the "megabit = 1,000,000 bits" text from section 6.6 be also used in sections 5.6 and 7.4, or even called out separately earlier on? (Maybe it's just my own experience, but I've found that, while "mebi" 100% of the time means base 2, "mega" only mostly means base 10 and occasionally can still be interpreted as base 2.)

[[ nits ]]

[ section 2 ]

* "Also, to foster the development..." This appears to be a sentence fragment rather than a grammatically correct sentence. I assume the point is to say that fostering the development ... is also a goal of the document.

* "The supporting" -> "Supporting the", might read more naturally?

[ section 5.4 ]

* "The statistics used to to summarize" -> "The statistics used to summarize"

[ section 6.5 ]

* "so that is" -> "so that it is"?
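On the megabit question, a small worked example of how far the two readings diverge. The bit count is made up for illustration and rate_in is a hypothetical helper, not anything from the document.

```python
MEGABIT = 1_000_000  # decimal reading, as in the section 6.6 text
MEBIBIT = 2 ** 20    # binary reading: 1,048,576 bits


def rate_in(unit_bits, bits, seconds):
    """Express bits/seconds as a multiple of unit_bits per second."""
    return bits / seconds / unit_bits


bits_received = 900 * MEBIBIT  # hypothetical count for a 1-second sub-interval
decimal_rate = rate_in(MEGABIT, bits_received, 1.0)  # reported in Mbit/s
binary_rate = rate_in(MEBIBIT, bits_received, 1.0)   # reported in Mibit/s
relative_gap = decimal_rate / binary_rate - 1
print(decimal_rate, binary_rate, relative_gap)
```

The same measurement reads as roughly 943.7 under the decimal definition and 900 under the binary one, a gap of just under 5%, which seems large enough to justify stating the definition once, early, for all three reporting sections.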
Since a SHOULD leaves an implementer with a choice, it's preferable to see prose explaining why one might deviate from the SHOULD advice. Thus, the SHOULDs in Sections 5.3 and 6.3 leave me wondering under what circumstances an implementer might legitimately choose to do something else. If there are none, should it be a MUST?
I'm stealing Robert's ballot text, because it perfectly explains my position/views: "Thank you for this document, I found it interesting to read. I have no comments that haven't already been raised by other AD reviews."
Thank you for this document, I found it interesting to read. I have no comments that haven't already been raised by other AD reviews.