
Simple Two-Way Active Measurement Protocol Extensions for Performance Measurement on LAG
draft-ietf-ippm-stamp-on-lag-06

Yes

(Martin Duke)

No Objection

Erik Kline
Francesca Palombini
Jim Guichard
Murray Kucherawy
(Andrew Alston)

Note: This ballot was opened for revision 05 and is now closed.

Warren Kumari
Yes
Comment (2023-11-29 for -05) Sent
Thank you for writing this - this seems like a very useful document, solving a real issue.

I have a number of comments and suggestions to try and make the document even better / clearer:

1: "Usually, when forwarding traffic over LAG, the hash-based mechanism is used to load balance the traffic across the LAG member links."
I suggest "when forwarding traffic over LAG, a hash-based mechanism is used" or "hash-based mechanisms are used" -- everyone has their own hash-based mechanism...

2: "Link delay of each member link varies because of different transport paths." -- "The link delay might vary between member links because..." -- the majority of LAGs (citation needed!) are single hop Ethernets between switches/routers, and have (for all intents and purposes) identical delay.  

3: "OWAMP [RFC4656] and TWAMP [RFC5357] are two active measurement methods according to the classification given in [RFC7799], which can complement passive and hybrid methods." I found this sentence hard to parse. Perhaps something like "According to the classifications in [RFC7799], OWAMP [RFC4656] and TWAMP [RFC5357] are active measurement methods, and they can complement passive and hybrid methods."

4: "This document intends to address the scenario (e.g., Figure 1) where a LAG (e.g., the LAG includes four member links) directly connects two nodes (A and B)."
This was another sentence that was tricky (for me) to parse. Perhaps "This document intends to address the scenario where a LAG directly connects two nodes (A and B). An example of this is in Figure 1, where the LAG consisting of four links connects nodes A and B." would be clearer?

Again, I think that this is a useful document, these are just readability suggestions...
Erik Kline
No Objection
Francesca Palombini
No Objection
Jim Guichard
No Objection
John Scudder
No Objection
Comment (2023-11-30 for -05) Sent
- I support Zahed's DISCUSS position.

- Thanks in particular for the clearly explained use case in the introduction of both this and its companion document.
Murray Kucherawy
No Objection
Paul Wouters
No Objection
Comment (2023-11-29 for -05) Sent
        This document intends to address

"Do, there is no try" :) E.g., use "This document addresses ....".

        Test packets MAY carry the member link information for validation check, which is RECOMMENDED.

MAY RECOMMENDED is a novel invention :) I recommend using "test packets are RECOMMENDED to carry ..."


Section 3.1: Should it state the two octet fields are in network order?
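
To make the byte-order question concrete, here is a minimal sketch, assuming the fields in question are two 2-octet unsigned identifiers. The names `sender_id` and `reflector_id` and both helper functions are hypothetical, and the TLV header is omitted. In Python's `struct` module, `!` selects network (big-endian) order:

```python
import struct
from typing import Tuple

def pack_ids(sender_id: int, reflector_id: int) -> bytes:
    """Pack two 16-bit unsigned identifiers in network (big-endian) byte order."""
    return struct.pack("!HH", sender_id, reflector_id)

def unpack_ids(data: bytes) -> Tuple[int, int]:
    """Read the two identifiers back from the first four octets."""
    return struct.unpack("!HH", data[:4])

wire = pack_ids(1, 0x0102)
print(wire.hex())                            # 00010102
print(struct.pack("<HH", 1, 0x0102).hex())   # 01000201 (little-endian, for contrast)
```

The two encodings differ on the wire, so a sender and a receiver that disagree on byte order would read different identifier values from the same octets.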

Operational Considerations: The text does not mention whether the testing may occur on a production
link (with non-test data), or whether the link is separated from the production link to obtain
more accurate measurements. That is, should there be an Operational Considerations section that
talks about how/when to run these tests?
Roman Danyliw
No Objection
Comment (2023-11-16 for -05) Sent
** Section 2.
   All micro sessions of a LAG share the same Sender IP Address and
   Receiver IP Address of the LAG.  As for the UDP Port, the micro
   sessions may share the same Sender Port and Receiver Port pair, or
   each micro session is configured with a different Sender Port and
   Receiver Port pair.  But from the operational point of view, the
   former is simpler and is RECOMMENDED.

I made the same comment for draft-ietf-ippm-otwamp-on-lag-07.  Is there a reason to provide both options?  Why would one not choose the RECOMMENDED approach?  Could the reason for choosing one configuration over the other be documented?
Zaheduzzaman Sarker
(was Discuss) No Objection
Comment (2023-12-14) Sent
Thanks for addressing my discuss comment.
Éric Vyncke
No Objection
Comment (2023-11-21 for -05) Sent
# Éric Vyncke, INT AD, comments for draft-ietf-ippm-stamp-on-lag-05

Thank you for the work put into this document.

Please find below some non-blocking COMMENT points (but replies would be appreciated even if only for my own education), and some nits.

Special thanks to Marcus Ihlar for the shepherd's detailed write-up including the WG consensus and the justification of the intended status. 

Other thanks to Haoyu Song, the Internet directorate reviewer (at my request), please consider this int-dir review:
https://datatracker.ietf.org/doc/review-ietf-ippm-stamp-on-lag-05-intdir-telechat-song-2023-11-16/ (minor nits but both Haoyu and myself will appreciate a reply to the review -- it is optional though)

I hope that this review helps to improve the document,

Regards,

-éric

# COMMENTS (non-blocking)

## Section 1

Suggest using "may vary" in `Link delay of each member link varies because of different transport paths`, i.e., in most deployments that I know, the links are strictly on the same physical path.

`we need to explicitly steer the traffic across the LAG member links based on the link delay, loss and so on`, besides the use of "we" (ambiguous in this context, as it is not the authors doing the steering), I also wonder whether there are really use cases for specific steering within a LAG group. This I-D has value outside the steering.

While I agree that the ECMP use case is similar and *should* have been a formal part of this I-D, is there any added value in the reference to RFC 9256?

## Section 2

Please expand "PM" at first use (legend of Fig 1).

Same for "SSID" (especially as it is overloaded by Wi-Fi), suggest to also add a section in the reference to RFC 8972.

`But from the operational point of view, the former is simpler and is RECOMMENDED.`, AFAIK, the LAG load balancing is also spread by the IP addresses and layer-4 ports, so how can the traffic be load balanced correctly among the links? I.e., it is clearly easier for operation, but what about the implementation?
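
The concern above can be illustrated with a minimal sketch, assuming a generic 5-tuple hash. The function `pick_member` and its SHA-256 based hash are hypothetical and are not taken from the draft or from any specific implementation; real devices use vendor-specific hashes. Micro sessions that share the same addresses and ports produce the same hash input, so a hash alone cannot spread them across member links:

```python
import hashlib

def pick_member(src_ip, dst_ip, src_port, dst_port, proto, n_links):
    """Hypothetical hash-based selection: map a 5-tuple to a member link index."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links

# Four micro sessions sharing one address and port pair (arbitrary example values).
shared = [pick_member("192.0.2.1", "192.0.2.2", 49152, 862, 17, 4) for _ in range(4)]
print(len(set(shared)))  # 1: identical 5-tuples always select the same member link
```

Under this assumption, per-member measurement depends on the sender explicitly placing each micro session's packets on its associated member link rather than on the hash.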

## Section 3.2

In `whether the reflected test packet is correctly transmitted over the expected member link` is it 'transmitted' (== sent) or 'received'?

# NITS (non-blocking / cosmetic)

## Section 2

s/estabilished/established/

## Section 3

s/introdued/introduced/

I.e., running a spell checker before submitting a revised I-D could be useful ;-)
Martin Duke Former IESG member
Yes
Yes (for -05) Unknown

                            
Andrew Alston Former IESG member
No Objection
No Objection (for -05) Not sent

                            
Lars Eggert Former IESG member
No Objection
No Objection (2023-11-30 for -05) Sent
# GEN AD review of draft-ietf-ippm-stamp-on-lag-05

CC @larseggert

## Comments

### Section 3.2, paragraph 2
```
     The micro STAMP Session-Sender MUST send the micro STAMP-Test packets
     over the member link with which the session is associated.  The
     configuration and management of the mapping between a micro STAMP
     session and the Sender/Reflector member link identifiers are outside
     the scope of this document.
```
This is a pretty critical piece of the process, though. At the very
least, I would have expected to see some text requiring at least one
session to be established per member link in order to utilize the LAG
fully. Are there other such considerations?
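
One way to express the per-member-link expectation above is a coverage check
over the session-to-link mapping. This is only a sketch: the mapping format,
the link and session names, and the function `uncovered_links` are assumptions
for illustration, since the draft leaves configuration of the mapping out of
scope.

```python
def uncovered_links(member_links, session_to_link):
    """Return the member links that have no micro session associated with them."""
    covered = set(session_to_link.values())
    return [link for link in member_links if link not in covered]

# Hypothetical LAG with four member links and three configured micro sessions.
links = ["member-1", "member-2", "member-3", "member-4"]
sessions = {"micro-a": "member-1", "micro-b": "member-2", "micro-c": "member-4"}
print(uncovered_links(links, sessions))  # ['member-3']
```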

### Boilerplate

This document uses the RFC2119 keywords "SHALL", "NOT RECOMMENDED", "SHALL
NOT", "SHOULD NOT", "MUST", "RECOMMENDED", "OPTIONAL", "MAY", "SHOULD",
"REQUIRED", and "MUST NOT", but does not contain the recommended RFC8174
boilerplate. (It contains some text with a similar beginning.)

## Nits

All comments below are about very minor potential issues that you may choose to
address in some way - or ignore - as you see fit. Some were flagged by
automated tools (via https://github.com/larseggert/ietf-reviewtool), so there
will likely be some false positives. There is no need to let me know what you
did with these suggestions.

### Typos

#### Section 2, paragraph 4
```
-    in fact STAMP sessions estabilished on member links of a LAG, test
-                                -
```

#### Section 3, paragraph 1
```
-    Micro-session ID TLV introdued in this section for validation check.
+    Micro-session ID TLV introduced in this section for validation check.
+                                +
```

#### Section 3, paragraph 1
```
-    reveived from the expected member link.  It can also verify whether
-      ^
+    received from the expected member link.  It can also verify whether
+      ^
```

#### Section 6, paragraph 1
```
-    are directly connnected.  As such, it's assumed that a node involved
-                   -
```

### Grammar/style

#### Section 1, paragraph 1
```
rics of each member link of a LAG. Hence the measured performance metrics ca
                                   ^^^^^
```
A comma may be missing after the conjunctive/linking adverb "Hence".

#### Section 3.2, paragraph 6
```
ket, the micro Session-Sender uses the the Sender Micro-session ID to check w
                                   ^^^^^^^
```
Possible typo: you repeated a word.

## Notes

This review is in the ["IETF Comments" Markdown format][ICMF]. You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues. Review generated by the [`ietf-reviewtool`][IRT].

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments
[IRT]: https://github.com/larseggert/ietf-reviewtool
Robert Wilton Former IESG member
No Objection
No Objection (2023-11-30 for -05) Sent
Hi, 

   This document extends Simple Two-Way Active Measurement Protocol
   (STAMP) to implement performance measurement on every member link of
   a Link Aggregation Group (LAG).  Knowing the measured metrics of each
   member link of a LAG enables operators to enforce a performance based
   traffic steering policy across the member links.

I have to confess that this whole approach feels somewhat like a layer violation to me.  I perceive that the benefit of LAG is to increase link bandwidth and add some level of redundancy without the higher layers needing to be aware that the LAG interface is composed of separate physical Ethernet interfaces (in the same way that most protocols don't need to know that a 400G Ethernet interface may be spread over multiple lambdas at the physical layer, because that abstraction is hidden from the layers above).  I would argue that at the point that we are starting to steer traffic onto particular LAG members (beyond a local passive load-balancing operation) and to monitor and report the performance characteristics of individual members, then perhaps the LAG abstraction is somewhat breaking down, and perhaps a cleaner solution would be to not have the interfaces in a LAG and to rely on something like ECMP instead.

However, I appreciate that my previous comment is not directly actionable by the authors and arguably this ship has already sailed with RFC 8668.  Hence, a potentially more actionable suggestion would be:

Should the document have any text regarding how these measurements should be used when individual LAG members fail? I presume that the pinned member traffic is then hashed to different LAG members instead.  Or to phrase this in an alternative way, are there operational deployment considerations about when and how this technology should be deployed, and what other LAG configuration should or should not be used at the same time to result in a sane, robust solution?

Regards,
Rob