Skip to main content

Clustered Alternate-Marking Method
draft-ietf-ippm-rfc8889bis-04

Yes

(Martin Duke)

No Objection

Erik Kline
(Andrew Alston)
(Robert Wilton)

Note: This ballot was opened for revision 02 and is now closed.

Erik Kline
No Objection
John Scudder
(was Discuss) No Objection
Comment (2022-08-18 for -03) Sent
Thanks for addressing my DISCUSS, I've cleared it and retained the text below for posterity.

Previous DISCUSS:

Thanks for this document. As you may have noticed I had considerable difficulty with the definition of "cluster". Once I completed an end-to-end read-through, this was resolved (good!) but because RFCs are often consumed piecemeal (e.g. someone may just dip into a portion of a document rather than settling down with a nice cup of tea to read it end-to-end), I think it's important to fix this problem, on the assumption I'm not the only person who might be thrown off.

I'll leave the details in the COMMENT, but I will repeat one observation from my comment #3, which is that I count at least four separate (re-)definitions of "cluster" in the document. With so many, it's no wonder that they're inconsistent, and quite possibly the simplest solution would involve cutting the number of definitions down to as close to 1 as possible.

COMMENT:

1. In §1, this

                                  While this document and its Clustered
   Alternate-Marking method is valid for multipoint-to-multipoint
   unicast flows, anycast, and ECMP flows.
   
This is a sentence fragment (a naked subordinate clause). I'd suggest a rewrite but I can't make out what you are trying to say.

2. You have multiple definitions of "cluster" in the document, it seems. In §2 you have,

      Cluster: Smallest identifiable subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.  A
      cluster partition algorithm can be applied to split the monitoring
      network into clusters.

As we have discussed in the past, according to this definition, a cluster is always a single router. I think in an earlier discussion you suggested that it could be corrected by saying something like "smallest subnetwork larger than a singleton router", which I think would work, "smallest non-trivial" would too (although it's less explicit). But then in §5 you have,

   In addition, it is also possible to leverage the data provided by the
   other counters in the network to converge on the smallest
   identifiable subnetworks where the losses occur.  These subnetworks
   are named "clusters".

As written, this makes nonzero losses a definitional attribute of a cluster. Now that I've taken in the entire document, I guess this is the incorrect definition. Possibly a fix would be to just drop the final sentence, which (incorrectly?) implies that losses are an essential element of the definition.

For what it's worth, the algorithm of Section 5.1 is perfectly clear. I see the desire to have a prose description of what a cluster is trying to do, but I wonder if in the end it would be best to make the algorithm the canonical definition, referenced from the prose definition as in 

      Cluster: Smallest identifiable non-trivial 
      subnetwork of the entire monitoring
      network graph that still satisfies the condition that the number
      of packets that go in is the same as the number that go out.  A
      cluster partition algorithm, such as that found in Section 5.1,
      can be applied to split the monitoring
      network into clusters.

3. In §5, you have

   A cluster graph is a subnetwork of the entire monitoring network
   graph that still satisfies the packet loss equation (introduced in
   the previous section), where PL in this case is the number of packets
   lost in the cluster.

I'd previously pointed out that this is problematic since the PL equation you mention is in the nature of a definition; since you don't supply a condition there's no way of saying whether it's "satisfied" or not. In response you said (https://mailarchive.ietf.org/arch/msg/ippm/iXSBuXrQaETl6MyeDH8DBaPiJ9Q/),

```
We can modify the wording in Section 5 accordingly:

"A cluster graph is a subnetwork of the entire monitoring network graph that still satisfies the condition that the number of packets that go in is the same as the number that go out, if no packet loss occurs."

```

If indeed that is the right definition (see my earlier point, so probably inserting "non-trivial" or "with more than a single router") then I do think you should make that change, or in any case you should correct the current text somehow.

By the way, do you mean something subtly different by "cluster graph" than you do by "cluster"? Or do you just mean "the graph that represents the cluster"? I think my confusion over just what a "cluster" is would have been mitigated by fewer re-definitions of it. If "cluster graph" is indeed just another way of saying "cluster" it's the third of four definitions! (The two I mention in my point #2, this one, and the algorithm in Section 5.1.) With four definitions of the same thing, in different words, no wonder some inconsistency crept in!

4. In §5.1,

   In summary, once a flow is defined, the algorithm to build the
   clusters partition is based on topological information; therefore, it
   considers all the possible links and nodes crossed by the given flow,
   even if there is no traffic.
   
If a flow has no traffic, can we call it a flow? It seems counter to the normal English meaning of the word. Perhaps you mean possible links and nodes that could potentially be crossed by the given flow?

5. Thanks for reporting the outcome of the earlier experiment, in §9. As with my review of rfc8321bis, I think the RFC 2119 keywords need to be in a "Deployment Considerations" section or similar, which could be achieved by retitling this section or by separating the material into reporting the experiment outcome (this section) and a new subsection for deployment advice.

6. Again similarly to my review of the companion document, I think §9's "the Multipoint Alternate Marking Method is RECOMMENDED only for controlled domains" needs to be fixed. Please bring it into line with whatever language you choose for rfc8321bis.




Nits:

7. While the reference tags are in principle arbitrary strings, I wondered if it was a typo that you used "PNPM" to tag a paper titled "AM-PM"?

8. In §7.1.1,

                 This means that, in the calculation, it is possible to
   weigh the timestamps by considering the number of packets for each
   endpoints.
   
s/endpoints/endpoint/
Murray Kucherawy
No Objection
Comment (2022-09-21 for -03) Sent
Thanks for the work on this.  It was pretty easy to follow.

In Section 5, the document suddenly starts using the term "arc".  What's an arc?

Two problems in Section 9.  The first:

      One flag: packet loss measurement MUST be done as described in
      Section 6 by applying the network clustering partition described
      in Section 5.  While delay measurement MUST be done according to
      the Mean delay calculation representative of the multipoint path,
      as described in Section 7.1.1.  Single-marking method based on the
      first/last packet of the interval cannot be applied, as mentioned
      in Section 7.2.1.

The "While delay ..." sentence seems to be incomplete.  Should it be attached (via comma) to the sentence before it, or is there something missing here?

The same problem occurs in the next ("Two flags:") paragraph.

The second problem is the SHOULD in that same paragraph.  SHOULD presents the implementer with a choice, and I suggest adding some prose here to explain why one might legitimately decide to do something other than what the SHOULD says.

Lastly, some nits:

In Section 5:

   In addition, it is also possible to leverage [...]

You don't need both "In addition" and "also".

In Section 7.2.1:

      Double marking or multiplexed marking work but only through
      statistical means.  [...]

s/work/works/

   If it is performed a delay measurement for more than one [...]

Suggest "If a delay measurement is performed for more than one ...".

In Section 9:

   ... there is different kind of information that can be derived.

Missing "a", I believe?

      For example, to get measurements on a multipoint-paths
      basis, one flag can be used.  While, to get measurements on a
      single-packet basis, two flags are preferred.

Suggest removing "While".
Paul Wouters
No Objection
Comment (2022-07-14 for -02) Not sent
I support Roman's discuss
Roman Danyliw
(was Discuss) No Objection
Comment (2022-08-05 for -03) Sent for earlier
Thank you for addressing my DISCUSS point.
Zaheduzzaman Sarker
No Objection
Comment (2022-07-13 for -02) Sent
Thanks for working on this specification. 

I am supporting Roman, John, Alvaro's discuss.

I have following comments which I believe would improve the document quality, if addressed.

## Section 2: I really didn't get the definition of the flow. May be it is my lack of expertise in English language or the technology used here but when the definition of flow is defined by flow ( is says- a flow can be multipoint-to-multipoint flow) it becomes very difficult to understand. Are we defining flow or identification of flow? It would be great to get a better description of the definition of  "flow". I also agree with John's comment that if a flow does not have traffic, is this also a flow? May be we merely defining the link between network nodes as flow, then I would say the term flow is overloaded.

Actually section 3 has a much better definition of flow. why don't we use that? 

       "a flow can be defined by a set of selection rules used to
   match a subset of the packets processed by the network device.  These
   rules specify a set of Layer 3 and Layer 4 header fields
   (identification fields) and the relative values that must be found in
   matching packets" 

## Section 2: according to the definition of the cluster, any single node can be a cluster and measure of the that cluster will be same as point to point measure defined in rfc8321bis (like node packet loss). is that correct interpretation? if yes, rfc8321bis becomes a subset of this specification, is that the intention?

## Section 2.1 : as group to group segments are not used to define cluster in this specification, I would suggest following change -

       OLD text: the spatial metrics, used for measuring the performance of
       segments of a source to destination path, are applied here to
       group-to-group segments (called clusters).

       NEW text : the spatial metrics, used for measuring the performance of
       segments of a source to destination path, are applied here to
       clusters.

## Section 4.2 : What is "Non-initial fragments"?

## I think it is confusing when terms defined in the terminology section get redefined in the later sections in a different way than how it is defined in the terminology section. the term "cluster" is an example of such case. can we stick to one single definition of cluster and other terms used in this document?
Éric Vyncke
No Objection
Comment (2022-09-20 for -03) Sent
Thanks for the work done in this document.

Please find below some non-blocking COMMENT points; a reply will be appreciated.

Regards

-éric

## COMMENTS

### I-D Name

The name of the I-D is quite confusing "multipoint-to-multipoint"... I would have really preferred to use "cluster" in the title and abstract.

### Cluster

About "cluster", I second John's original comment about the "fuzziness" of the cluster definition.

### Section 5

"multipoint flows" is also weird as usually a flow is between 2 entities. Suggest to use "flow aggregate"

### Cluster and NAT

A cluster is only defined as packets in == packets out. But, if 'flow' is defined per IP addresses or 5-tuple, those properties must also be invariant, i.e., neither IPv4 NAPT nor IPv6 NPT nor encapsulation/decapsulation. This seems obvious, but this should be mentioned.

### Section 9 

The title is about 'recommendations', but the text contains 'MUST' and not 'IS RECOMMENDED'. Suggest to change either the section title or the text itself.
Martin Duke Former IESG member
Yes
Yes (for -02) Unknown

                            
Alvaro Retana Former IESG member
(was Discuss) No Objection
No Objection (2022-08-02 for -03) Sent for earlier
[Thanks for addressing my DISCUSS.]
Andrew Alston Former IESG member
No Objection
No Objection (for -03) Not sent

                            
Lars Eggert Former IESG member
(was Discuss) No Objection
No Objection (2022-07-12 for -03) Sent for earlier
## Comments

### Section 3, paragraph 15
```
     The case of unicast flow is considered in Figure 1.  The anycast flow
     is also in scope, because there is no replication and only a single
     node from the anycast group receives the traffic, so it can be viewed
     as a special case of unicast flow. 
```
Anycast is only a special case of a unicast flow is routing is stable throughout
the measurement period.

### Section 4.2, paragraph 0
```
  4.2.  Network Packet Loss
```
The definitions in this section seem to assume that routing is stable during the
measurement period (because otherwise the loss definitions don't work in case of
route changes or loops). That assumption should probably be made more explicit.

### Section 5, paragraph 5
```
     In a completely monitored unidirectional network (a network where
     every network interface is monitored), each network device
     corresponds to a cluster, and each physical link corresponds to two
     clusters (one for each device).
```
What is a "unidirectional network"?

### Section 5.1, paragraph 0
```
  5.1.  Algorithm for Clusters Partition
```
It would be helpful if this algorithm was also specified as pseudo code, rather
than just by example.

### Section 7.1.1, paragraph 2
```
     The average latency can be measured as the difference between the
     weighted averages of the mean timestamps of the sets of output and
     input nodes.  This means that, in the calculation, it is possible to
     weigh the timestamps by considering the number of packets for each
     endpoints.
```
Wouldn't you need to require that the weight is exactly the number of packets to
make the math work, and not just a weight that "considers" the number of
packets, i.e., is an unspecified function of it??

### Section 7.2.2, paragraph 2
```
     The hash-based selection methodologies for delay measurement can work
     in a multipoint-to-multipoint path and MAY be used either coupled to
     mean delay or stand-alone.
```
I think the MAY needs to be a MUST, because otherwise unspecified alternatives
are permitted?

### Section 7.2.2, paragraph 4
```
     In a multipoint environment, the hashing selection MAY be the
     solution for performing delay measurements on specific packets and
     overcoming the single- and double-marking limitations.
```
I don't understand why this uses an RFC2119 MAY?

### Section 9, paragraph 2
```
     Either one or two flag bits might be available for marking in
     different deployments:
```
The use of SHOULD and MAY in the three sections following this is leaving things
underspecified. What happens if SHOULDs and MAYs are not followed?

## Nits

All comments below are about very minor potential issues that you may choose to
address in some way - or ignore - as you see fit. Some were flagged by
automated tools (via https://github.com/larseggert/ietf-reviewtool), so there
will likely be some false positives. There is no need to let me know what you
did with these suggestions.

### Boilerplate

Document still refers to the "Simplified BSD License", which was corrected in
the TLP on September 21, 2021. It should instead refer to the "Revised BSD
License".

### Grammar/style

#### Section 1, paragraph 6
```
 of problems (packet loss is measured or the delay is too high), the filterin
                                     ^^^
```
Use a comma before "or" if it connects two independent clauses (unless they are
closely connected and short).

#### Section 1, paragraph 8
```
DISCUSS: Note that the fragmented packets case can be managed with the Alter
                                  ^^^^^^^
```
An apostrophe may be missing.

#### Section 4.2, paragraph 3
```
 algorithm was also specified as pseudo code, rather than just by example. A
                                 ^^^^^^^^^^^
```
This word is normally spelled as one.

#### Section 5.1, paragraph 13
```
 defined in Section 4.2, valid for the an entire monitored flow, can easily
                                   ^^^^^^
```
Two determiners in a row. Choose either "the" or "an".

#### Section 5.1, paragraph 25
```
ng the number of packets for each endpoints. Wouldn't you need to require th
                                  ^^^^^^^^^
```
The noun should probably be in the singular form.

#### Section 5.1, paragraph 32
```
 can do the calculation. If we would perform a delay measurement for more th
                               ^^^^^^^^^^^^^
```
Consider removing "would". (Usually, "would" does not occur in a conditional
clause, unless to make a request or give a polite order.).

Also, the document uses first person plural ("we") in several places, which is
unusual in an RFC.

#### Section 7, paragraph 1
```
with the hashing selection allows to choose a simplified hash function since
                                  ^^^^^^^^^
```
Did you mean "choosing"? Or maybe you should add a pronoun? In active voice,
"allow" + "to" takes an object, usually a pronoun.

#### Section 7, paragraph 3
```
along the path. For example, in case an hashed sample is lost, it is confined
                                     ^^
```
Use "a" instead of "an" if the following word doesn't start with a vowel sound,
e.g. "a sentence", "a university".

#### Section 8, paragraph 4
```
of necessity (packet loss is measured or the delay is too high), the filterin
                                     ^^^
```
Use a comma before "or" if it connects two independent clauses (unless they are
closely connected and short).

## Notes

This review is in the ["IETF Comments" Markdown format][ICMF], You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues. Review generated by the [`ietf-reviewtool`][IRT].

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments
[IRT]: https://github.com/larseggert/ietf-reviewtool
Robert Wilton Former IESG member
No Objection
No Objection (for -02) Not sent