Computing-Aware Traffic Steering (CATS) Problem Statement, Use Cases, and Requirements
draft-ietf-cats-usecases-requirements-14
Note: This ballot was opened for revision 12 and is now closed.
Jim Guichard
Yes
Andy Newton
No Objection
Comment
(2026-01-21 for -12)
Sent
# Andy Newton, ART AD, comments for draft-ietf-cats-usecases-requirements-12
CC @anewton1998

* line numbers:
  - https://author-tools.ietf.org/api/idnits?url=https://www.ietf.org/archive/id/draft-ietf-cats-usecases-requirements-12.txt&submitcheck=True

* comment syntax:
  - https://github.com/mnot/ietf-comments/blob/main/format.md

* "Handling Ballot Positions":
  - https://ietf.org/about/groups/iesg/statements/handling-ballot-positions/

## Thanks to the Reviewers

Thanks to Tim Bray for the ARTART review.

## Comments

Thanks for this document and the work that went into it.

### Applicability to all networks

    907  R9: CATS MAY be applied in non-CATS network environments when needed,
    908  considering that CATS is designed with extensibility and could work
    909  compatibly with existing non-CATS network environments when the
    910  network components in these environments could be upgraded to know
    911  the meaning of CATS metrics.

Is the intent of this requirement to allow a CATS solution to be applicable outside of CATS environments? In other words, can a solution be considered a CATS solution even if it is useful in non-CATS environments? Or is it only a CATS solution if it is applicable to a non-CATS environment that will be upgraded to a CATS environment?

If it is the former, do you think a clarification is necessary? Thanks.
Deb Cooley
No Objection
Comment
(2026-01-22 for -12)
Sent
Thank you to Daniel Migault for their secdir reviews.

Please spell out acronyms (that are not listed w/ a * in https://www.rfc-editor.org/rpc/wiki/doku.php?id=abbrev_list). This includes SD-WAN, VR, AR, etc.

Section 6: While R19 remains a broad requirement, R20, R21, R22 immediately jump to encryption, even though there appear to be requirements for authentication, integrity, and confidentiality.

Section 6, R20: The requirement for authentication does not normally lead directly to encryption. In fact, some forms of 'encryption' do not provide authentication, i.e. the data can still be modified. Perhaps there is a requirement for proof of authentication over the data? Digital signatures are often a suggested mechanism to meet a requirement like this.

Section 6, R21: The requirement for 'encryption' is not sufficient. The requirement for integrity can be done in conjunction with R20 to protect the data. It can also be met by using an encryption algorithm with an Authenticated Encryption with Associated Data (AEAD) mode.

Section 6, R22: End-to-end encryption does not always prevent the modification of the underlying data (for example, homomorphic encryption). Using an AEAD mode with your favorite encryption algorithm will protect the data from modification as well as disclosure. ICC codes are also a way to protect the integrity, although there are issues (which is why they have fallen out of favor).
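
The distinction drawn above for R20 to R22, that encryption alone does not stop modification, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `keystream` construction (SHA-256 in counter mode) is a toy stand-in for a real stream cipher, the `load=10` metric string is invented, and encrypt-then-MAC with HMAC is used here only as a simple substitute for a vetted AEAD mode. It is not taken from the draft and is not suitable for production use.

```python
import hashlib
import hmac


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustrative only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream: confidentiality only."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    """Encrypt-then-MAC: returns (ciphertext, tag)."""
    ciphertext = xor_cipher(enc_key, plaintext)
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def open_sealed(enc_key: bytes, mac_key: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Verify the tag before decrypting; reject any modified ciphertext."""
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("integrity check failed")
    return xor_cipher(enc_key, ciphertext)


# Hypothetical keys and metric payload, chosen only for the demonstration.
enc_key = b"k" * 32
mac_key = b"m" * 32
metric = b"load=10"

# 1) Encryption without authentication is malleable: an on-path attacker
#    flips bits in the ciphertext without knowing the key.
ciphertext = xor_cipher(enc_key, metric)
tampered = bytearray(ciphertext)
tampered[5] ^= ord("1") ^ ord("9")  # turn the "1" in "10" into "9"
forged = xor_cipher(enc_key, bytes(tampered))  # decrypts to b"load=90"

# 2) With a tag over the ciphertext, the same modification is detected.
sealed_ct, tag = seal(enc_key, mac_key, metric)
bad_ct = bytearray(sealed_ct)
bad_ct[5] ^= ord("1") ^ ord("9")
try:
    open_sealed(enc_key, mac_key, bytes(bad_ct), tag)
    detected = False
except ValueError:
    detected = True
```

In the first case the receiver decrypts a plausible but altered metric and has no way to notice; in the second the altered ciphertext is rejected before decryption, which is the property an AEAD mode provides in a single primitive.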
Erik Kline
No Objection
Comment
(2026-01-19 for -12)
Sent
# Internet AD comments for draft-ietf-cats-usecases-requirements-12
CC @ekline

* comment syntax:
  - https://github.com/mnot/ietf-comments/blob/main/format.md

* "Handling Ballot Positions":
  - https://ietf.org/about/groups/iesg/statements/handling-ballot-positions/

## Comments

### S5.1

* "mapping CS-ID to one or more current service instance addresses"

  Surely there are circumstances where a CS-ID might have *zero* instances?
  (e.g. failure scenarios)

  What are the requirements around any of these special cases?

### S5.3

* How can R14 be met? It seems like every possible solution would be
  "dependent on or vulnerable to" the mechanisms that gather and distribute
  the data used by the solution, no?

## Nits

### S3.2

* "routering systems" -> "routing systems"?

* "get a poor latency sometime" -> "experiences poor latency", perhaps
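
The S5.1 question about a CS-ID with zero instances can be made concrete with a minimal sketch. The `ServiceRegistry` class, the `NoServiceInstanceError` exception, the `cs-video` identifier, and the addresses below are all hypothetical names invented for illustration; the draft does not define such an interface. The sketch only shows that the zero-instance case needs an explicit outcome, which is what the comment asks the requirements to cover.

```python
from dataclasses import dataclass, field


class NoServiceInstanceError(LookupError):
    """Raised when a CS-ID currently maps to zero service instances."""


@dataclass
class ServiceRegistry:
    """Hypothetical CS-ID to service-instance-address mapping."""

    instances: dict[str, set[str]] = field(default_factory=dict)

    def register(self, cs_id: str, address: str) -> None:
        """Record that an instance of the service is reachable at address."""
        self.instances.setdefault(cs_id, set()).add(address)

    def withdraw(self, cs_id: str, address: str) -> None:
        """Remove an instance, e.g. after a failure; unknown entries are ignored."""
        self.instances.get(cs_id, set()).discard(address)

    def resolve(self, cs_id: str) -> list[str]:
        """Return the current instance addresses, or fail explicitly if none."""
        addresses = sorted(self.instances.get(cs_id, ()))
        if not addresses:
            # The special case raised in the review: zero instances.
            raise NoServiceInstanceError(f"no instances for CS-ID {cs_id!r}")
        return addresses


registry = ServiceRegistry()
registry.register("cs-video", "192.0.2.10")    # documentation address
registry.register("cs-video", "198.51.100.7")  # documentation address
before_failure = registry.resolve("cs-video")

# Both instances fail; the mapping is now empty for this CS-ID.
registry.withdraw("cs-video", "192.0.2.10")
registry.withdraw("cs-video", "198.51.100.7")
try:
    registry.resolve("cs-video")
    zero_case_signalled = False
except NoServiceInstanceError:
    zero_case_signalled = True
```

Raising an error is only one possible behavior; a solution could equally fall back to a default site or hold the request, and which of these is required is exactly the open point.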
Gorry Fairhurst
(was Discuss)
No Objection
Comment
(2026-01-28 for -13)
Sent
Thanks also to Zhed for his TSV-ART review and thanks to the authors and the WG for their work on this document and in resolving my DISCUSS topics.

Comments (non-blocking) follow:

Abstract: I am curious about the "more factors" mentioned in the introduction. Could these be enumerated to set the focus of the document?

I see the following text: "However, it is undeniable that..", which seems a very strong assertion for the IETF. Is it possible to rephrase this?
Paul Wouters
No Objection
Ketan Talaulikar
(was Discuss)
Abstain
Comment
(2026-01-28 for -13)
Sent
Thanks to the authors and the WG for their work on this document and the discussion of my comments.

While the v13 introduced some improvements, it does not substantially address the points that I had raised. This comes down to the WG consensus to publish this document in its current state as opposed to my view that the document is not ready to be published (basically that it is still work-in-progress) since it leaves out (IMHO) important aspects that need to be addressed for the document to be helpful as an IETF RFC reference.

I am changing the position of my ballot from DISCUSS to ABSTAIN to not stand in the way of the WG consensus after discussions with the responsible AD.

Original DISCUSS ballot contents follow:

I have some points that I would like to discuss.

<discuss-1> Should the problem statement not include gap analysis?

The objective of the CATS solution/framework is clear in the charter itself. I believe one of the goals of the Problem Statement analysis is to identify gaps in "how things are done today" (as in the functional blocks that exist and are in use today), what are the gaps that need to be filled (new capabilities introduced by CATS - this is covered), and the external interfaces/interactions between the new and old to be developed. I find the Problem Statement portion lacking these details. There are no citations for the "current state of art" except for ALTO (not sure of its real world use) and the use of anycast.

Mainly I was looking for the client application's interaction with the CATS solution. Section 3.2 says: "When a client issues a service request for a required service, the request is steered to one of the available service instances." The details of what the envisioned/supported ways of making such a service request are, and what the gaps to "steer" it to available service instances are, are missing. Is this a DNS query? Are DNS servers meant to interact with CATS?
Being a routing person, this is not my area of expertise and I was looking for some analysis from a routing area WG (with help of inputs from INT/ART/WIT) to cover these aspects (with appropriate citations).

<discuss-2> Is there necessary clarity to publish these use-cases at this juncture of the development of the CATS framework/architecture?

The use-cases do not describe the current state of art (neither do I see adequate citations for them). In my understanding, all of these use-cases are already deployed and in use currently. They are just being implemented differently and the development of CATS solution is supposed to integrate in these use-cases and improve them. Please correct me if I am wrong. If I am correct, then it would be good to know these improvements or new capabilities and how they would integrate with CATS.

More importantly, the use cases come across as claims of how CATS architecture/framework would 'solve their problems and provide improvements' (my words) while the CATS system is still under specification and not yet implemented (let alone integrated/deployed). Is it appropriate to publish this in an RFC at this juncture? My view is that we are not there yet and the use-cases are better published as RFC after integration/deployment with CATS to serve as practical and relevant blueprints.

<discuss-3> Requirements - do we have the necessary and sufficient set?

I've found the requirements related to the routing aspects (e.g., metrics) to be sufficiently worked out - it is OK for them to be at a high-level at this stage. However, I am missing the requirements for the glue/integration/adaptation with existing components in the application layer but also the routing system. Are we sure we have everything to publish? Or is it a work in progress?

Looking at ISAC in appendix A and the mailing list discussion on it, looks like it has not been covered and fully fleshed out. Likely more is needed to be done.
But I don't see a conclusion of that discussion or the WG decision - (a) the WG is desirous of investigating this requirement and covering it, OR (b) the WG does not believe this to be a problem to solve. I get the impression that authors are saying (a) but they don't want to take it up as yet. This is perfect. But yet another reason to wait and not publish the requirements?
Roman Danyliw
Abstain
Comment
(2026-01-21 for -12)
Sent
Thank you to Roni Evan for the GENART review.

I am choosing to ABSTAIN on this document. I was challenged in evaluating this document given the level of detail it provided and the precision by which the notional architecture, use cases, and requirements were explained. My specific feedback is as follows:

** Section 1.

   Network operators are increasingly deploying computing resources, with a focus on enhancing edge capabilities to support services requiring low latency, high reliability, and dynamic resource scaling.

Is it just “network operators”? Is “network operators” the right term? I was under the impression that the “entire cloud industry” and “content delivery industry” (which also provides compute for application) builds out infrastructure geographically proximate to their customers.

** Section 4.1. What is the difference between a normal cloud (e.g., https://aws.amazon.com/, https://www.tencentcloud.com/) and an “edge cloud” described here?

** Section 4.2

   Urban intelligent transportation relies on a large number of high-quality video capture devices, whose data needs to be processed at edge service sites (e.g., pedestrian flow statistics, vehicle tracking). This imposes stringent requirements on the computing capabilities of edge service sites (for video processing) and network performance (bandwidth, latency). CATS can address the issue by coordinating network and computing resources.

Given how specific the previous example in Section 4.1 is, could these “stringent requirements” be defined more precisely?

** Section 4.2

   In auxiliary driving scenarios (for example, "Extended Electronic Horizon" [HORITA]), edge service sites collect road and traffic data via V2X to address blind-spot and collision risks, and provide real-time warnings and maneuver guidance. Requests are typically sent preferentially to the closest edge node.

As above. Is this technology widely deployed enough to provide the field experience to frame the existing gaps for CATS to solve?
** Section 4.3. I am having trouble understanding the link between digital twin models which can be run on a number of platforms and this CATS work. It isn’t clear to me how the industrial control application is a novel use case demonstrating CATS.

** Section 4.5.1 As described here, it is unclear what about this use case is related to edge services, rather than just “traditional data centers”.

** Section 4.5.2

   Although large language models are nowadays confined to be trained with very large centers with computational, often GPU-based, resources, platforms for federated or distributed training are being positioned, specifically when employing edge computing resources.

I don’t understand the use case. Can more be said about the circumstances where the significant computing resources required to train models are being distributed across _edge_ resources? This seems the opposite of the trends to grow larger, and larger data centers. Could this edge resource be better explained? What class of users is seeking this type of solution?

** Section 5.

   In the following, we outline the requirements for the CATS system to overcome the observed problems in the realization of the use cases above.

It isn’t clear with some of these requirements what novel requirements are being defined for CATS. Some of these seem akin to design requirements of building any IT system. A non-comprehensive annotation of the requirements is as follows:

-- Section 5.2

   R3: The implementations MUST agree on using metrics that are oriented towards compute capabilities and resources and their representation among service instances in the participating edges, at both design time and runtime.

Does this amount to saying any system (CATS or otherwise) needs to agree on the data format that it exchanges (in the case of CATS, metrics)?

-- Section 5.2

   R6: The Resource Model MUST be executable in a scalable manner. That is, the Resource Model MUST be capable of being executed at the required time scale and at an affordable cost (e.g., memory footprint, energy, etc.).

Does this amount to saying any system (CATS or otherwise) needs to scale to the size of the use case?

-- Section 5.2

   R8: All metric information used in CATS MUST be produced and encoded in a format that is understood by participating CATS components.

What is the alternative to this design for any system? Doesn’t any system component (CATS or otherwise) need to understand the data it exchanges (CATS metrics or otherwise)?

-- Section 5.2

   R9: CATS MAY be applied in non-CATS network environments when needed, considering that CATS is designed with extensibility and could work compatibly with existing non-CATS network environments when the network components in these environments could be upgraded to know the meaning of CATS metrics.

Does this amount to saying that any new network technology (CATS or otherwise) could be brought to a network that doesn’t use it if this network is updated to use the newer networking technology?
Mohamed Boucadair
Recuse
Comment
(2026-01-14 for -12)
Not sent
As I'm a contributor to the document.