
Telechat Review of draft-ietf-cats-usecases-requirements-12
review-ietf-cats-usecases-requirements-12-artart-telechat-bray-2026-01-15-00

Request Review of draft-ietf-cats-usecases-requirements
Requested revision No specific revision (document currently at 14)
Type Telechat Review
Team ART Area Review Team (artart)
Deadline 2026-01-20
Requested 2026-01-14
Authors Kehan Yao, Luis M. Contreras, Hang Shi, Shuai Zhang, Qing An
I-D last updated 2026-05-20 (Latest revision 2026-02-02)
Completed reviews Rtgdir Early review of -07 by Ines Robles (diff)
Tsvart IETF Last Call review of -10 by Zaheduzzaman Sarker (diff)
Dnsdir IETF Last Call review of -10 by Jim Reid (diff)
Genart IETF Last Call review of -10 by Roni Even (diff)
Artart IETF Last Call review of -10 by Tim Bray (diff)
Secdir IETF Last Call review of -11 by Daniel Migault (diff)
Rtgdir IETF Last Call review of -10 by Linda Dunbar (diff)
Opsdir IETF Last Call review of -12 by Samier Barguil (diff)
Artart Telechat review of -12 by Tim Bray (diff)
Tsvart Telechat review of -12 by Zaheduzzaman Sarker (diff)
Assignment Reviewer Tim Bray
State Completed
Request Telechat review on draft-ietf-cats-usecases-requirements by ART Area Review Team Assigned
Posted at https://mailarchive.ietf.org/arch/msg/art/Yp7rvL3DGXPuFT7Z0clbiRj_Qys
Reviewed revision 12 (document currently at 14)
Result Ready w/issues
Completed 2026-01-15
This is the second ARTART review and has no special standing.

This review concerns draft-ietf-cats-usecases-requirements-12. This draft is
substantially improved since the last draft reviewed, but still has a few
issues.

General:

There is no mention of CDNs. Are they not an important piece of the puzzle?
Many of the CDN providers advertise the availability of edge compute in various
flavors.

It says in 4.2 that it's a requirement that CATS is entirely
application-transparent. Shouldn't that show up as one of the numbered
requirements?

There is very little discussion of databases, which I think is a gap.  In many
real-world scenarios (a majority?) the performance of distributed systems is
dominated by a shared database which maintains system state.  I think the
discussion of metrics and bottlenecks would benefit from a mention of this.
This comes up at several points in the detailed comments that follow.

Abstract

"follow and use" - why not just "use"?

The phrase "the typical scenarios" suggests that those offered are *the* most
common.  Maybe just "and typical scenarios"

1.

"Regular capacity expansion of a single site is neither practical nor
economical."  Lots of people do this, could possibly acknowledge this reality
by saying "is often neither…"

"However, existing routing schemes and traffic engineering methods fall short"
This is often not true. Need to acknowledge with a "sometimes" or "often".

3.2 ALTO and Anycast are mentioned. How about good old ICMP?

4.1

This section is now quantitative and much better. One problem: In my
experience with AR/VR, the server side provides a scene graph and images from a
database, while the GPU work is done on the client, in something like the
Vision Pro. There's typically not enough network bandwidth to send the GPU
*output*, just its input.  So I think the paragraph beginning "The conclusion
is…" needs some work to revise the GPU-specific language. It takes plenty of
CPU to build a scene graph, so the analysis is still broadly correct.

4.2 "high-quality video and LIDAR data" would be more accurate

4.5.2 "platforms for federated or distributed training are being positioned,
specifically when employing edge computing resources." Really?  Would love to
see a citation here.  I am not convinced that this is really a thing.

5. Requirements. Maybe I'm being pedantic, but all this 2119 language in an
informational RFC bothers me.  Whenever an RFC says something MUST be some way,
shouldn't there be a crystal-clear way to ascertain whether it is or not?

5.1

The first sentence here is very good, and I would try to include it in the
Abstract or Introduction.

What does a CS-ID identify? An instance of a particular user of a particular
application?  It seems like an important concept. If there's a nice crisp
reference and I missed it, oops.

5.2

Note that in a high proportion of cases, the limiting factor in a distributed
application is a shared database rather than CPU or networking.  The metric
that gets interesting in that case is the throughput in transactions per
second, which might be low even when the CPU load is not high.
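
To make the point concrete, here is a minimal sketch of a toy capacity model.
Everything in it is an assumption for illustration (the function names, the
three-instance deployment, and all the numbers); none of it comes from the
draft. It only shows how a shared database can cap throughput while CPU load
stays low.

```python
def achievable_tps(instance_cpu_tps, shared_db_tps):
    """Toy model: total throughput is capped by whichever is smaller,
    the combined CPU capacity of the instances or the shared database."""
    return min(sum(instance_cpu_tps), shared_db_tps)


def cpu_utilization(instance_cpu_tps, shared_db_tps):
    """Fraction of the combined CPU capacity in use at that throughput."""
    return achievable_tps(instance_cpu_tps, shared_db_tps) / sum(instance_cpu_tps)


# Hypothetical deployment: three service instances, each able to handle
# 2000 transactions/s on CPU alone, all sharing one database that
# sustains 1500 transactions/s.
instances = [2000, 2000, 2000]
db_limit = 1500

total_tps = achievable_tps(instances, db_limit)
cpu_load = cpu_utilization(instances, db_limit)
print(f"throughput: {total_tps} tps, CPU load: {cpu_load:.0%}")
```

In this made-up case a CPU-load metric would report every instance as lightly
loaded, yet steering more traffic to any of them would not help.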

In the real world, it is very common to see metrics associated with
percentiles, for example "P99 latency".  To the extent that when someone says
to me "I need 150ms latency" I suspect they don't know what they're talking
about if they don't say P50 or P95 or some such.  It feels weird that this
isn't even mentioned as an issue.
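
As a rough illustration of why the percentile matters, the sketch below
computes the mean and nearest-rank percentiles for a made-up latency sample.
The sample values and the percentile() helper are assumptions for this
example only; nearest-rank is just one of several common percentile
definitions.

```python
import statistics


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100: the smallest
    sample such that at least p percent of the samples are <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical measurements in milliseconds: 95 fast requests, 5 slow ones.
latencies_ms = [100] * 95 + [1000] * 5

mean_ms = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(f"mean={mean_ms} ms  P50={p50} ms  P95={p95} ms  P99={p99} ms")
```

With these numbers the mean comes in under a 150ms target, while one request
in twenty takes a full second. A bare "150ms latency" requirement cannot
distinguish between those two readings.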

R6. For this requirement to be enforced, there must be supporting metrics (not
sure what they would be) describing the performance of the Resource Model.

5.3 What’s a "CATS Domain"?

R14: The beginning of the sentence is missing? I don't understand what it's
trying to say and to the extent I do, it seems unrealistic? Of course the
performance of the CATS facility is going to be sensitive to the metrics update
frequency.  How could it not be?

5.4.

The discussion of stateless services and the value of HTTP-style requests is
good, but also implies, as mentioned above, that there's a database somewhere
that maintains the service state.  Which is to say, once again, that its
performance is apt to be critical in the CATS context.

The paragraph beginning "However, a client, e.g., a mobile UE, may…" is
troublesome. If I have multiple client apps on a field device, typically they
connect to service endpoints through DNS names, and the DNS names are
service-specific. In fact, load balancing is commonly accomplished by fiddling
DNS tables. It would be perfectly reasonable for different apps to have zero,
some, or lots of instance affinity requirements.

R15. What does "per-flow" mean?

5.5

R18 is huge and complex and intersects the legal/regulatory environment, which
should be stated. This probably requires some metrics and quite a lot more
discussion. (There is some in the Security Considerations section, which should
probably be referenced here.)

6.

"Description: Attackers may spoof legitimate service instance identities" -
really? In the case where the traffic is HTTPS, it becomes difficult to spoof
service instance identities.

"Description: Attackers may tamper with core scheduling metrics or submit false
data" - once again, one would hope that CATS traffic offers at least TLS levels
of security and excludes plaintext.  This document should be clear on this.
Seriously, anything that wants to have a chance of getting through the IETF
process in 2026 needs to be rigorous on this point.