Chairs:
Jeff Tantsura (jefftant.ietf@gmail.com)
Yingzhen Qu (yingzhen.ietf@gmail.com)
WG Page: https://datatracker.ietf.org/group/rtgwg/about/
Materials: https://datatracker.ietf.org/meeting/124/session/rtgwg
##
9:40
Multicast usage in LLM MoE
https://datatracker.ietf.org/doc/draft-zhang-rtgwg-llmmoe-multicast/
Sandy Zhang (10 mins)
9:50
Distributed Inference Network (DIN) Problem Statement, Use Cases,
and Requirements
https://datatracker.ietf.org/doc/draft-song-rtgwg-din-usecases-requirements/
Jian Song (10 mins)
10:00
Agent networking use cases, requirements, and architecture
https://datatracker.ietf.org/doc/draft-zl-agents-networking-architecture/
Nan Geng/ Li Zhang (10 mins)
Acee: If there is something that needs to be standardized, it is in
operations, not routing. Do you agree?
Li: We have not yet considered carefully which area it should belong
to; maybe the routing area can do some work on agent routing or
selection of audience.
Jie: There are two major types of use cases. One is about the agent
gateway, which may require some routing mechanisms to distribute
the agent information. The second type is more related to
operations, troubleshooting, or monitoring.
Arashmid: I wonder whether you have looked at the agent gateway
protocol. The protocol itself is interesting, but at the same time
it's important what we carry in that protocol and what the summary
mechanism is: how do we summarize all these agents behind it? It
reminds me of BGP. I strongly suggest taking a look at the agent
gateway protocol to see whether there is anything in there that we
can augment. I'm not sure where the right place for it is, because
it's not really routing. There is talk about a new working group
that will hopefully be formed for agent-to-agent communications; it
might be useful to bring this there if that working group gets
established. I'm not sure routing is the right place.
Li: Thanks for your comments; we will consider them.
David Black: What do you want the IETF, and specifically the routing
area, to actually do?
Li: It's still under discussion. We will respond on the mailing list.
David: It would be more useful if you could come with specific
proposals about things to do or problems to be solved.
Tal Mizrahi: When you say scheduling, I assume you mean a specific
time frame. What's your time accuracy?
Pavan: It depends on the type of job. Even in scenarios with
short-lived jobs, you will need a congestion avoidance mechanism.
==============================================================
10:20
Fast Notification Problem Statement
https://datatracker.ietf.org/doc/draft-dong-fantel-problem-statement/
Jie Dong (15 mins)
David Black: Your characterization of ECN is not correct. Please
look at what's done in the transport area, for example the Accurate
ECN work in the TCPM working group and L4S in TSVWG. The same applies
to the use cases: please focus on inter-DC cases, that's where the
opportunity is to make a visible improvement. Inside a DC, there is
limited opportunity.
Daniel Huang: Do the recipients of the notifications include
sending hosts, instead of networking nodes?
Yingzhen: The draft was first presented before the FANTEL BoF.
Keyur: Two points: (1) This is specific to one vendor's particular
silicon, please generalize it. (2) There is something similar in
L3DL in LSVR, happy to chat after.
Alia: This is specific to one vendor, happy to hear what other
options are in the space. I do have some minor technical questions,
will chat offline.
Jeffrey: Yes, it was extended from one implementation, but our
intention is that it should be generic.
David: These slides provide some motivation outside of a DC. For AI
clusters within data centers, there are other means of solving the
problem, collective communication libraries such as NCCL for example.
10:50
Proxy for Congestion Notification
https://datatracker.ietf.org/doc/draft-xiao-rtgwg-proxy-congestion-notification/
Xiao Min (10 mins)
Joel Halpern: We're talking about fast notifications for events
which have very short lifetimes. Congestion comes and goes. Now
you're talking about congestion notifications over long distances,
and I'm not sure how that works. Now this is about a proxy in the
network, which has to process the notification and transform it, and
that adds latency. I can't reconcile the two.
Xiao Min: we will present some test results.
No questions asked.
===================================================================
Sasha Vainshtein
00:14:55
Do I understand correctly that the actual solution will be discussed in
the BIER WG?
Sasha Vainshtein
00:15:55
That was about the multicast for LLM
Jeff Tantsura
00:18:52
@sasha - there was indeed a presentation and discussion in BIER
Weiqiang Cheng
00:18:53
Sorry for missing voice due to bad multicast network;)
Weiqiang Cheng
00:21:20
My question: The current switch chips for AIDC can't support BIER, do
you need a new ASIC for the solution?
Greg Shepherd
00:24:55
Yes. Hardware BIER support is ideal. Jericho2 and Jericho3 both support
BIER. I believe there are others as well
Alex Nichol
00:26:31
There will be latency implications if DNX is required
Jeff Tantsura
00:27:03
the big question is whether GPU/NIC will implement BIER, and the need
for a proxy on the FHR
Weiqiang Cheng
00:35:02
When considering Broadcom's chips, for AIDC applications, the XGS series
is primarily used rather than the DNX series. For example, the Tomahawk
5, to the best of my knowledge, does not support BIER.
Yingzhen Qu
00:39:46
@meetecho, camera to the presenter please
Sasha Vainshtein
00:40:04
Lots of thanks to David Black for asking the question about the previous
presentation.
Jeffrey Haas
00:42:48
As I'm noting in chat with someone else, rtgwg is the routing area's
dispatch group, so we get a lot of peculiar early work. That said, the
routing and forwarding discussion is only a portion of the work that was
presented. The minute you start having conversations, it starts being
more of an application protocol.
Jeffrey Haas
00:43:23
Also, no fair that there's another slurm than the one in RPKI. :-)
Jeff Tantsura
00:44:09
perhaps use of the word "router" confuses people, neither application
routers nor MoE routers have anything to do with packet routers :)
Adrian Farrel
00:46:37
@Jeff. Further, if there is only one hop (at the layer where the
"routing" is happening) then it is just a classification, steering, or
destination choice.
Yingzhen Qu
00:51:37
Please help with the minutes:
https://notes.ietf.org/notes-ietf-124-rtgwg?both
Adrian Farrel
00:53:09
I wonder whether we have a WG in RTG that could pick up new work related
to resource reservations and TE?
Adrian Farrel
00:53:23
Perhaps Pavan knows the chairs
Jeff Tantsura
00:53:33
@adrian - hmm....
Shaofu Peng
00:54:27
even using resource reservation, may still get congestion due to link
failure...
Boris Khasanov
00:55:06
@Adrian - CATS?
Sasha Vainshtein
00:55:22
@Adrian - I second Jeff's reaction😉
John Scudder
00:55:35
To @Tal Mizrahi's question, my impression from the talk and a very brief
scan of the draft (I was not involved in its development) is that the
word "scheduling" may be misleading; that this is traditional TE-style
resource reservation, not DETNET-style. (But maybe I've misunderstood
either the question or the draft...)
Adrian Farrel
00:56:26
I think my sarcasm comes around...
If you want to TE the network, go ahead. If you want to schedule server
resources (there are plenty of places doing that work). If you want to
coordinate this work then maybe CATS or NMRG
Jeffrey Haas
00:57:04
... we say as we're about to spend our time talking about congestion in
the routing area.
Jeffrey Haas
00:57:33
(drafts, I mean. not the plentiful other places we create congestion)
Adrian Farrel
01:00:56
Well, I am also a little puzzled as to whether Fantel is RTG, although
the consumer of notifications is routing/steering, not throttling. So
possibly right
Vishnu Beeram
01:03:36
@Adrian -- To TE or not to TE is no longer the question :) -- we didn't
take it to CATS because this is not traffic steering; the tools for
doing MPTE reservation are being discussed in TEAS and will continue to
get discussed there; the reason for presenting this draft here is
because it is related to the fantel conversation..
Tony Li
01:04:41
It would seem like 'scalable' is also a requirement. Is there any
evidence that all of these requirements can be concurrently satisfied?
Mike McBride
01:05:52
@Adrian - RTG because need to determine how fast notifications should be
delivered (new protocols, extensions, UDP-based...)
Alia Atlas
01:08:27
There does seem to be a wide extension around a basic problem. It'd be
interesting to understand how each is intended to work independently. I
do see the work making sense in RTG - for the impact on IP
forwarding/steering and for the fast-notification protocol.
Boris Khasanov
01:08:41
@Tony, yes - multidimensional scale is a very good question
Tony Li
01:09:08
This seems like it's now getting into solutions. And losing
'lightweight'.
Jeff Tantsura
01:09:47
how fast is fast?
Mike McBride
01:10:17
millisecond, sub millisecond?
John Scudder
01:10:27
Same-day service. :-P
Adrian Farrel
01:10:35
FTL
Jeffrey Haas
01:10:36
The place where there's existing demonstration of some of the overlap is
the correlators in routing for the congestion notification. See
draft-ietf-idr-next-next-hop-nodes as one example. That said, where the
congestion mechanism lives is what we're talking about.
Jeff Tantsura
01:10:38
@john :)
Carolina Caeiro
01:11:01
Guys, I am a bit confused by process here. I understand that the result
of the Fantel BoF is that there wouldn’t be a new group, and that this
should be directed here. Now, we are discussing whether and what angles
of this work could be adopted here?
Tony Li
01:11:02
My understanding is that we need to prevent all congestion based packet
loss, so we need to respond faster than the buffer space of anything on
the path.
Reshad Rahman
01:11:14
The answer to the 2nd question (DP only?) depends on the answer to the
1st question (fast notifications only or notifications in general?)
Nitsan Dolev Elfassy
01:11:38
The desired nature of the notifications seems somewhat clear; my problem
is the lack of examples of expected realistic use of this information.
Adrian Farrel
01:11:55
@Carolina I believe the word was "incubated". That might result in: no
RFCs; a cluster of RFCs in RTGWG; a future WG
Jeffrey Haas
01:11:58
+1 to tli. Multi-hop distribution of this stuff will have interesting
congestion, latency, loss, and jitter issues. That might be fine for
"slow" stuff like WAN use cases. For AI/DC?
Adrian Farrel
01:12:51
@Jeffrey it's like BFD congestion order n factorial
Jeffrey Haas
01:14:45
Yeah, I might have some sensitivities to this discussion for that
reason.
Reshad Rahman
01:17:17
@Zafar I think keeping the scope small and focusing on fast
notifications is preferable
Alia Atlas
01:17:49
For the questions on how data can be used, I did find that
draft-cheng-rtgwg-adaptive-routing-framework-04 was useful. I'm not sure
how that is perceived to fit into the potential broader scope.
Jeffrey Haas
01:18:05
Greg Mirsky doesn't appear to be present to discuss, but I find it
likely that we'll end up talking about a previously discussed solution
space - leveraging distribution machinery similar to BFD (but not using
BFD!) to carry streams of congestion info. the router-info draft does it
differently.
https://datatracker.ietf.org/doc/html/draft-mmm-rtgwg-integrated-oam-02
Jeffrey Haas
01:19:29
The general pacing of this sort of state will set a significant portion
of our requirements for the solution protocol. Steady? On demand? etc.
Tony Li
01:20:41
Steady would seem to contradict 'lightweight'
David Black
01:20:59
steady -> periodic ?
Alia Atlas
01:21:38
It's a trade-off - more accurate reporting of congestion speed vs.
processing
Jeffrey Haas
01:22:04
Yes, periodic. And to tli's point, the contents vs. rate sets a lot of
the discussion about bfd. "What do you want to tell the other side about
every < 30ms?"
Jeffrey Haas
01:22:54
The rate also may push the solution to TLV or to template based.
Jeffrey Haas
01:23:16
TLVs make routing people happy. templates make OAM people happier in
some circumstances.
Tony Li
01:23:39
The format seems like a trivial issue.
Jeffrey Haas
01:24:11
The implementation is "trivial". The choice tends to be important, if
contentious.
Alia Atlas
01:25:07
extensibility and ability to have changes
Jeffrey Haas
01:25:31
I nod towards ipfix as an example of template based that is extensible.
Jie Dong
01:26:58
One thing I forgot to mention is whether we want to cover both in-band
and out-of-band notifications, or only one of them?
David Black
01:30:39
Agree with Jeff - point is that the CCLs avoid the traffic pattern that
causes problems.
Jeffrey Haas
01:31:49
@jie Part of that depends on whether you're discussing single hop or
multi hop distribution of the state and whether multi-hop is congruent
with the forwarding path or not.
Alia Atlas
01:32:05
@Jeff - fair enough. I just think naturally in TLVs - but doing this
with low computation matters.
Adrian Farrel
01:32:14
@Jie Aaaaaaaaagh! You said "in-band" and "out-of-band"
Jeffrey Haas
01:32:35
He could use other scare words like iOAM if you like.
Adrian Farrel
01:32:37
Please be super-precise and not use these broken-in-a-packet-network
terms
Adrian Farrel
01:33:23
https://datatracker.ietf.org/doc/draft-ietf-opsawg-oam-characterization/
:-)
Jeffrey Haas
01:33:25
To tli's point, GLB is just one use case enabled by such a message bus.
Jeffrey Haas
01:34:30
To Jie's point, what use cases get enabled by other similar message
buses depends on forwarding model for the messages and how they are
routed and with what correlators.
Jie Dong
01:34:32
@Adrian :)
Jeffrey Haas
01:34:54
and yes, many OAM folk would just call out "amateurs!"
Maria Matějka
01:37:38
joel: +1
Jeffrey Haas
01:43:52
For this srv6 presentation, I suspect the folk in tcp would share wisdom
about this sort of thing being solved in the application layer rather
than strictly forwarding.
Joel Halpern
01:47:06
It is unclear what "local traffic control" means, or how it could
possibly help the problem. If all it means is more aggressive discard,
then I can at least understand the question. If it meant that, it should
say that.
Jeffrey Haas
01:47:35
I interpreted one case of it as rate shaping.
David Black
01:47:38
@Jeff - +1, and generalize beyond TCP - this sort of separate
notification should complement what the transport protocols are doing
in-band.
Joel Halpern
01:50:02
Expecting P nodes to perform rate shaping (as implied in both this and a
likely reading of the previous presentation) seems like a very bad idea.
Jeffrey Haas
01:50:18
I tend to agree, Joel.
Jeffrey Haas
01:51:43
The rsvp style case moving the conversation toward distributed rate
shaping is likely where much of this needs to go. As noted to someone
else earlier this week, if we're not capable, we'll reinvent ATM.
Jeffrey Haas
01:52:07
s/capable/careful
Adrian Farrel
01:52:24
Oh, is that the date, already? Yes it is time to reinvent ATM
Jeffrey Haas
01:52:38
RFC 1925 s-what?
Donald Eastlake
01:52:47
I was thinking how much this sounded like ATM.
Joel Halpern
01:53:05
We could instead just re-invent X.25.
Boris Khasanov
01:53:17
:)
Tony Li
01:54:29
ICMP is frequently used for DoS attacks, thus it is rate limited by all
IP nodes.
Jeffrey Haas
01:54:47
My ICMP comment at the microphone was this: ICMP is often treated
negatively by all forwarding paths. it is rate limited, and often
treated poorly in precedence in the forwarding paths.
Adrian Farrel
01:54:55
I always missed two things in RFC 1925 Rule 3: there is no rule 3 "For
more information, please re-read this RFC."
Nitsan Dolev Elfassy
01:55:03
IMHO, this proposal is extremely non scalable.
Tony Li
01:55:44
@Nitsan Dolev Elfassy This seems to be the case starting with the
problem statement.
Adrian Farrel
01:55:45
What Tony says and all of the tools being discussed may easily become
DoS vectors
Jeffrey Haas
01:56:30
People are already unhappy with bfd multihop security considerations.
General network congestion mechanisms will have significantly scarier
security considerations when done multi-hop.
Sasha Vainshtein
01:56:37
Quoting (3) from RFC 1925:
Sasha Vainshtein
01:57:21
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly overhead.
David Black
01:57:26
Credit-based flow control (cbfc) comment - CBFC is complex-enough, doing
this per flow significantly increases complexity of an already-complex
RSVP implementation.
Jeffrey Haas
01:58:02
See also prior work why original RSVP (not RSVP-TE) didn't gain general
popularity.
Tony Li
01:58:16
@David Doing anything in RSVP would seem to contradict being 'fast'.
Sasha Vainshtein
01:58:21
@David Black - looks like (3) from RFC 1925 really applies...