Minutes IETF123: rtgwg: Mon 12:30
minutes-123-rtgwg-202507211230-00
| Meeting Minutes | Routing Area Working Group (rtgwg) WG | |
|---|---|---|
| Date and time | 2025-07-21 12:30 | |
| Title | Minutes IETF123: rtgwg: Mon 12:30 | |
| State | Active | |
| Last updated | 2025-07-30 | |
IETF 123 RTGWG Minutes
Chairs:
Jeff Tantsura (jefftant.ietf@gmail.com)
Yingzhen Qu (yingzhen.ietf@gmail.com)
WG Page: https://datatracker.ietf.org/group/rtgwg/about/
Materials: https://datatracker.ietf.org/meeting/123/session/rtgwg
## 14:30-16:30 - Monday Session III, July 21, 2025
- 14:30
Meeting Administrivia and WG Update
Chairs (10 mins)
- 14:40
Destination/Source Routing
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-dst-src-routing-revive/
Shu Yang (10 mins)
- David Lamparter (co-author): Source selection was tested at the hackathon;
src/dst routing is the only thing that got proper RFC 8028
compliance.
- Eric: Why compare dst/src routing with SRv6? They are completely
different.
- Shu: They are two techniques to achieve load balancing. We'd like to
compare their control cost, but that doesn't mean that one is
much better than the other.
- 14:50
YANG Data Model for IPv6 Neighbor Discovery
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-ipv6-address-resolution-yang/
Fan Zhang (10 mins)
- Acee: Suggest doing an analysis of RFC 4862 and RFC 4861 to see if
there's anything left in the base specification. I'd like the
description of the model to say which functions from the base ND
specification are covered, where they're covered, and which are not
covered. Secure ND is in a separate RFC, so it's all right to
remove that. Will review it again.
- Ron: Any review from 6MAN, or has it been posted to 6MAN? 6MAN owns ND.
- Yingzhen: Not yet; this one is relatively new, expanded from ARP. It
should be.
- Eric: As INT AD, this draft should move to 6MAN. Comments on the
model: SEND is not used anywhere; for inbound or outbound router
solicitations it's very important to differentiate between unicast
and multicast; and you can receive multiple RAs from multiple
routers in this model.
- Jen: As 6MAN chair, send it to the 6MAN list. We can mention it in
the chair slides and ask people to pay attention. ARP and ND should
be separated. Maybe it would make sense to have a document focused
on ND reviewed by 6MAN.
- Jeff: Chairs will talk about it. Moving the ND part to
6MAN seems pretty obvious.
- 15:00
SRv6 Path Egress Protection
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-srv6-egress-protection/
Xinxin Yi
- Jim Guichard: Why is the draft not in SPRING, and why is the
protocol extension not in LSR? Has this been socialized with SPRING
and LSR?
- Yingzhen: The draft was presented in SPRING and LSR before. RTGWG
has been working on all FRR- and protection-related issues. After
a discussion among SPRING, RTGWG and LSR, it was decided to have
RTGWG as the home of this draft, but it has been presented
everywhere.
- Jim: When you do the WG last call, it needs to be copied to SPRING
and LSR.
- Sasha: How does this draft apply to BGP service SIDs? What PEA
advertises would probably be a service SID, and the IGP wouldn't be
aware of any such SID. Is this draft supposed to provide node
protection for BGP-based services?
- Xinxin: P1 is the PLR, and PEB advertises the information that
contains the mirror SID. When PEA fails, P1 can quickly reroute to
P2, PEB and then CE2.
- Sasha: How does P1, which is not a BGP speaker, know anything about
BGP service SIDs? If this only works with node and adj SIDs, which
are advertised in the IGP, this should be explicitly stated in the
draft. If you want the egress node protection function to be extended
to BGP-based services, then something is missing in the draft, and it
should also be discussed with the BESS WG.
- 15:10
Discussion about BFD-based solutions for faster VRRP convergence
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-vrrp-bfd-p2p/
Aditya Dogra, Greg Mirsky (20 mins)
- Yingzhen: There are two drafts addressing fast VRRP failover using
BFD. We'll have one presenter from each draft present their idea,
mainly focusing on the advantages and disadvantages of each solution,
and then we'll get the working group's opinion on how we want to
proceed.
- Acee: Have read both drafts. The first one has some good ideas, but
it's really at the expense of changing VRRP. I'd like to see the use
cases where you have more than two or three nodes, e.g., full-mesh
connection. The other thing: it seems like people are using BFD
without doing something like this. They're either configuring it, or
now we have unsolicited BFD, the session list. So the backups
could just say that there's the active, and they could say it's
going to form a P2P session with them. And if there are only two or
three, you're not saving that many sessions, at the expense of making
the protocol much more complex with the backup. On the positive side,
this is kind of like the OSPF concept of a backup designated router,
so you have it precomputed and you don't really have to go
through an election. But normally, if you only have a few, the jitter
will handle that anyway, so you're really not saving a lot of time.
- Aditya: We can also mention more deployments in the draft. In
enterprise deployments it's mostly one or two backups, but in data
centers we usually need a higher number of backups. There are
proprietary solutions, such as four-way HSRP and others, that are
meant only to provide more backups. This draft also looks at it
from the deployment point of view. If there are
implementations such as unsolicited BFD, we can take a look, but I
don't know how that would solve the problem of reducing the jitter
time.
(About the second draft.)
- Acee: Two very blocking comments. The first is that you don't need to
gratuitously modify the BFD packets, given that you're using this
encapsulation and you've already configured these routers to use
P2MP. You can demultiplex based on the primary address and don't need
to modify the BFD packet. Look at all the other protocols: OSPF
doesn't have any discriminators in its hellos. You don't
have to do that.
- Greg: Disagree, because on the same segment we can have multiple
VRIDs.
- Acee: You can demultiplex on the primary address that's being backed
up, and you've already configured these guys to be in this P2MP tree.
So you can't say that they don't know that they're running VRRP and
using BFD for it.
- Greg: RFC 8562 requires P2MP BFD sessions to be demultiplexed based on
the socket information: the source address and My Discriminator.
- Acee: Rather than changing VRRP, it would be much better to say that
in this case we're deviating and just demultiplexing based on that,
and take all that unnecessary stuff out. People who don't implement the
protocols always come up with all these complicated encodings.
- Greg: We agreed that My Discriminator is a useful extension to the
protocol. But whether it's sufficient to demultiplex BFD
sessions without using discriminator information, from the BFD point
of view, we need to look at. The source address alone might not be
sufficient.
- Acee: I'd like to see that justification. The other thing is that the
other draft specifies the changes to the VRRP state machine, where
you're going to inject at least the handling of the event. I'm hoping you
don't add anything to the sending and receiving of VRRP packets.
- Greg: No. Basically, detecting that the BFD session is down is
the same as detecting it from VRRP messages. It's the same event.
- Acee: If you're not modifying the state machine, you're not
modifying sending and receiving the packet, which implies that
you've agreed not to add the discriminator and the flags to the
VRRP packet. So, you're going to take that out.
- Greg: We need to look, from the BFD point of view, at whether it's
sufficient to demultiplex BFD sessions without using discriminator
information. But regarding the VRRP state machine, you are correct. This
proposal does not change it, because it acts the same as detecting
that VRRP messages are being missed. It allows running VRRP messages at a
much slower rate and enables fast detection using BFD messages.
- Acee: Less opposed to the first one than to this one in its current
form.
- Sasha: What happens if one of the non-active backup routers
participating in a given VRRP group doesn't support P2MP BFD?
I suspect that this would mean that this router would be overloaded
with tons of multicast packets that it would receive. It would have
to trap them, because the destination address, at least by default, is
supposed to be the same as that used in the VRRP address advertisement
messages. Then the control plane would have to respond with something like
destination unreachable/port unreachable. The second question is
what is supposed to happen to the hosts on the LAN where VRRP is
deployed. Would they also receive this fast flow of P2MP BFD packets,
or would you expect the switches implementing this LAN
to be explicitly configured not to forward
these packets to hosts attached to them? These two questions should
be addressed if you are going to use P2MP BFD. The main
advantage of using one-hop IP BFD, maybe in unsolicited mode as
already defined, would be eliminating all these problems.
- Greg: From the Ethernet encapsulation point of view, what we suggest is
that P2MP BFD control packets be encapsulated the same way as VRRP. So
if somebody considers that using VRRPv3 at a 10-millisecond interval
is acceptable, then I wonder why it would not be acceptable for
P2MP BFD packets.
- Sasha: The very idea is to substantially reduce the rate of VRRP
advertisement packets while still providing fast protection.
POLL: Do you think we should only have one solution? Yes (43), No (4),
No Opinion (26)
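
(Editor's illustration, not part of the discussion.) The demultiplexing disagreement in this item can be sketched in a few lines. This is not taken from either draft: the `Tail` class, its methods, and the addresses are hypothetical, and it only contrasts the two lookup keys that were debated, namely source address plus My Discriminator versus source address alone. It assumes one router is active for two VRIDs on the same segment, so both P2MP BFD sessions arrive from the same source address.

```python
from dataclasses import dataclass, field


@dataclass
class Tail:
    """Hypothetical P2MP BFD tail holding a session table (illustration only)."""
    use_discriminator: bool
    sessions: dict = field(default_factory=dict)

    def _key(self, src_addr: str, my_discr: int):
        # The choice of lookup key is the point debated in the minutes.
        return (src_addr, my_discr) if self.use_discriminator else (src_addr,)

    def add_session(self, src_addr: str, my_discr: int, vrid: int) -> None:
        self.sessions[self._key(src_addr, my_discr)] = vrid

    def demux(self, src_addr: str, my_discr: int):
        """Return the VRID a received packet maps to, or None if unknown."""
        return self.sessions.get(self._key(src_addr, my_discr))


def build(use_discriminator: bool) -> Tail:
    tail = Tail(use_discriminator)
    # Assumption: the same head (192.0.2.1) is active for VRID 1 and VRID 2.
    tail.add_session("192.0.2.1", my_discr=101, vrid=1)
    tail.add_session("192.0.2.1", my_discr=102, vrid=2)
    return tail


with_discr = build(use_discriminator=True)
addr_only = build(use_discriminator=False)
print(with_discr.demux("192.0.2.1", 101), with_discr.demux("192.0.2.1", 102))
print(addr_only.demux("192.0.2.1", 101), addr_only.demux("192.0.2.1", 102))
```

Under these assumptions the address-only table keeps a single entry, so both packets resolve to the same VRID. Whether that situation matters in practice, or whether some other field could tell the sessions apart, is exactly the open question between Acee and Greg.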
- 15:30
Lightweight Host Routing using LLDP
https://datatracker.ietf.org/doc/draft-filsfils-rtgwg-lightweight-host-routing/
Dan Bernier (10 mins)
- David: Strongly opposed. LLDP is specifically designed to bypass
STP. You can get LLDP even if the port is blocked in STP. In the worst
case you have a route that is pointing to a port that is blocked in
STP. It's a fixable problem. Secondly, on the use of MACsec: the
multicast group used to establish the keys and SA for MACsec is not
the same as for LLDP; MACsec has a longer range. You have keys
negotiated with another device that is not in fact the same device
that is processing LLDP. So you need to run either LLDP in a
non-standard configuration or MACsec in a non-standard configuration
to get keys to use MACsec for this. This is a mix of layers that
don't belong with each other. Using LLDP is not the same as using
the limited-scope multicast addresses in Ethernet.
- Weiqiang: This is a good solution. We have a draft using a similar
mechanism but dedicated to SRv6. Maybe we can have some discussion and
collaboration on that.
- Jeff Haas: Agree with David, especially about the scoping stuff,
since we actually went through this exercise for BGP auto discovery
as well under layer 2. The max scoping is a big deal. The packets
can only be a certain size for LLDPv1, so you're going to be pushed to
LLDPv2 for this, and you've just radically increased it, on top of the
fact that default timers would have this timing out in 120 seconds.
Are we about to see BFD for LLDP static routes?
- Acee: Not terribly opposed to it, but we should be very careful
about adding more RIB clients and route types, and especially an L2
protocol installing L3 routes. It seems that you could use RIPv2 and
just do this without putting it in LLDP, and then you'd have
something that's already standardized. You could say, we're taking
this reserved field and we're using it for the flags. And the host
should not specify an IGP algorithm.
- Tony P: MACsec is a huge security gap that you won't be able to
close for practical purposes. Hope IEEE catches you while you're
truck-dumping L3 into L2, and then we'll see how it goes.
- 15:40
YANG Data Model for SRv6 Next Hop of Route
https://datatracker.ietf.org/doc/draft-lin-rtgwg-srv6-nexthop-yang/
Yisong Liu (10 mins)
- Kamran: There's already an SRv6 YANG base draft in SPRING. Some of
the work in this draft can perhaps continue in that document.
- 15:45
Enhanced ECMP for AI Cluster
https://datatracker.ietf.org/doc/draft-cheng-rtgwg-enhanced-ecmp/
Weiqiang Cheng (10 mins)
- Jeff T: You're assuming even load distribution on egress, which is
true for traditional dense transformer models but not for
MoE. Suggest looking into scenarios where egress load is uneven and
bursty. Send an email to the working group list and ask people to
read and comment.
- 15:55
SR based Loop-free implementation
https://datatracker.ietf.org/doc/draft-deng-rtgwg-sr-loop-free/
Lijie Deng (10 mins)
- Sasha: Have you compared this draft with RFC 5715, which
contains a detailed description of multiple scenarios and multiple
loop avoidance techniques?
- Lijie: Our draft describes different scenarios and some optional
implementation methods.
- Zafar: A similar draft in RTGWG on uloop didn't progress. And why
document internal behaviors?
- Xuesong: That is a question for the whole WG. This is an
informational document about implementation, but we think it's
beneficial to have a document about the detailed implementation
of this topic.
- Acee: If the other document didn't progress, there's probably a reason,
and we don't want to combine them if we keep this one progressing. The
question should be whether we want to work on micro loops.
- 16:05
Use cases, requirements and framework for implementing lossless
techniques in Wide Area Networks
https://datatracker.ietf.org/doc/draft-hs-rtgwg-wan-lossless-uc
https://datatracker.ietf.org/doc/draft-han-rtgwg-codeployment-pfc-fgfc/
https://datatracker.ietf.org/doc/draft-ruan-spring-priority-flow-control-sid/
https://datatracker.ietf.org/doc/draft-hs-rtgwg-wan-lossless-framework/
Ran Pang (10 mins)
- Yuval Shavitt: Shouldn't this be dealt with at layer 4 or the
application layer?
- Ran: The scenario is about wide area networks with a large RTT;
if you solve this at the application layer, the latency will be
large.
- 16:15
Fully Adaptive Routing Ethernet in Scale-Up Networks
https://datatracker.ietf.org/doc/draft-xu-rtgwg-fare-in-sun/
Xiaohu Xu (10 mins)
- Jeff T: The draft needs to provide more technical details.
Everything presented here will work without any changes with
technologies in IDR using link bandwidth. It's unclear what's needed
on top of what we're already doing. Need to update the draft.
- Xiaohu: The first step is to establish BGP-based FARE in IDR. Once
that's done, maybe it can be extended to scale-up networks. RTGWG
is a better place to do this work because there's no need for
further modification to protocols.
- Jeffrey Zhang: What does the bandwidth mean: the available bandwidth
in real time, or the port capacity?
- Xiaohu: We just use the link bandwidth, not the available or remaining
bandwidth. Adaptive routing means we can distribute traffic over
different paths according to the bandwidth. It doesn't matter whether
it's the available bandwidth or the link bandwidth.
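
(Editor's illustration, not part of the discussion.) Distributing traffic over paths "according to the bandwidth" can be sketched as weighted hashing of flows. This is a generic weighted-ECMP style technique, not the mechanism specified in the draft; the function name, the SHA-256 flow hash, and the link names and bandwidth figures are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def pick_path(flow_id: str, paths: dict[str, int]) -> str:
    """Map a flow to a path, weighting each path by its link bandwidth.

    `paths` maps a path name to its bandwidth; the total must be positive.
    """
    names = sorted(paths)                               # stable order across calls
    bounds = list(accumulate(paths[n] for n in names))  # cumulative weights
    digest = hashlib.sha256(flow_id.encode()).digest()
    point = int.from_bytes(digest[:8], "big") % bounds[-1]
    # The first cumulative bound strictly greater than `point` owns the flow.
    return names[bisect_right(bounds, point)]


# Hypothetical links: one 100G path and one 400G path.
links = {"link-a": 100, "link-b": 400}
counts = {name: 0 for name in links}
for i in range(10_000):
    counts[pick_path(f"flow-{i}", links)] += 1
print(counts)
```

Each flow always hashes to the same path, and over many flows the share per path approaches the bandwidth ratio. The weights here are static link bandwidth, matching Xiaohu's answer; feeding in available bandwidth instead would only change the values in `links`.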
(No time for the remaining presentations.)
- 16:25
Multicast usage in LLM MoE
https://datatracker.ietf.org/doc/draft-zhang-rtgwg-llmmoe-multicast/
Zheng Zhang (10 mins)
If time permits:
Kademlia-directed ID-based Routing Architecture (KIRA)
https://datatracker.ietf.org/doc/draft-bless-rtgwg-kira/
Roland Bless (10 mins)
From Chat:
Jeffrey Haas
00:23:03
At one point, ops WGs outside of routing were hesitant to take YANG
modeling work and this was at least one reason things got pushed to
routing area. Are the other areas picking up modeling work? (signed,
routing-mostly-person...)
Jeffrey Haas
00:45:59
unsolicited bfd is a reasonable fit for the vrrp use case, at least when
the sessions are potentially restricted to known interfaces running
vrrp.
Jeffrey Haas
00:47:13
https://datatracker.ietf.org/doc/rfc9468/
Aditya Dogra
00:54:15
Thanks Jeff, will take a look at the unsolicited bfd.
Yingzhen Qu
00:55:13
Collective note taking: https://notes.ietf.org/notes-ietf-123-rtgwg?both
Jeffrey Haas
00:56:16
fwiw unsolicited bfd may not be listed as "supported" by a number of
platforms, but any place you find bfd for static routes, it's very close
to unsolicited bfd under the covers.
Jeffrey Haas
01:00:54
To Sasha's point about partial support for p2mp bfd for vrrp, if one
host doesn't support it, it will rely on the standard failure time for
VRRP and the rest of the deployment running its failure state machine.
Jeffrey Haas
01:04:08
arguably if your implementation of vrrp can support fast timers, you'd
not bother with bfd. The only motivation is if your vrrp implementation
scales worse than your bfd.
David Lamparter
01:08:56
(my comment queueing is for the end of the presentation)
Tom Hill
01:13:16
What was wrong with RIPng?
Zhaohui Zhang
01:14:20
or DHCP?
David Lamparter
01:14:59
good question. I guess it doesn't carry all the details like IGP
algorithm & metric or A/L flags
Zhaohui Zhang
01:15:57
But we're talking about extensions ...
Jeffrey Haas
01:16:25
Tom Hill said:
What was wrong with RIPng?
Count to infinity fun if generally talking about rip. This is
effectively link scoped RIP.
Ketan Talaulikar
01:16:58
Algo is for redistribution into IGP flexible algo
David Lamparter
01:17:12
"less-than-link" scoped RIP with the special LLDP ethernet multicast
group
David Lamparter
01:17:41
(01:80:C2:00:00:0E, not L2 forwarded by any device)
Jeffrey Haas
01:17:46
It'd be lovely to understand the use case better to know why we're not
just pushing this into ICMP
Ketan Talaulikar
01:17:53
If the host need latency in the fabric then this enables it further in
the fabric
Joel Halpern
01:18:30
The current habit of the IETF abusing IEEE protocols seems a bad idea.
We don't like other folks abusing our protocols.
David Lamparter
01:19:09
+1 to Joel
Jeffrey Haas
01:19:22
We do at least have a sandbox for LLDP abuse with an IETF registered
code point. But dump trucking someone else's playground requires care.
Gyan Mishra
01:28:49
There are other options for distribution of SRv6 locators; however,
since LLDP is used by all hosts universally, this seems like a
simple, lightweight addition to LLDP and a good solution. The use case for
hosts in the DC is for SRv6 cases where the DC fabric is being extended to
the host. As we are just distributing an IPv6 address, it does not seem
to be abusing LLDP. Thank you
Stephane Litkowski
01:30:49
BGP extensions should require IDR draft , no ? :)
Jeffrey Haas
01:33:01
It certainly wouldn't be unprecedented that IDR finds out about BGP
draft work that starts and mostly stays elsewhere until it's ready to
ship
Yingzhen Qu
01:34:47
I asked the same question to the authors. They wanted to introduce the
idea in RTGWG first before talking about encoding in IDR.
Jeffrey Haas
01:35:02
LLDP is great if you're just announcing a thing for discovery. (See the
D in the acronym.) Anything that involves liveness rather pushes the use
case. Hence my comment about BFD for these sorta-static routes.
Gyan Mishra
01:37:00
@Jeff Haas I believe AFAIK the issue with VRRP is that it requires a full
mesh of sessions, with the complexity and overhead that brings, and these
two drafts provide a possible solution. Even if the VRRP implementation
has fast timers, the full mesh of sessions is a lot of overhead, to which
these two drafts provide an alternative.
The benefit with P2MP BFD is avoiding the churn of electing and promoting
a new VRRP active: with P2MP BFD the tails monitor the active VRRP node
and, with the lower overhead, can provide optimal convergence. Thanks
Jeffrey Haas
01:38:48
Note, @Greg Mirsky , that I didn't say the bfd use case may not be
helpful. Only that if you already scale fine, BFD becomes unnecessary.
The p2p vs. p2mp drafts start looking differently attractive based on
scale of VRRP speakers.
Daniel Bernier
01:38:52
@jeffrey agreed, LLDP is not a liveness type of protocol; nothing prevents
use of unsolicited BFD once a TLV has been received to validate
availability. As I explained, there are further updates coming.
Jeffrey Haas
01:40:19
@Daniel Bernier The BFD comment is mostly offered sarcastically. I'd not
suggest LLDP be used for this for all of the many reasons @David
Lamparter articulated.
Daniel Bernier
01:40:46
@joel, personally I think IETF should not work in a vacuum ... if there
is an already widely deployed protocol implementation that can still be
extended in IETF ... why not use it rather than force another approach.
Daniel Bernier
01:41:40
@jeffrey ... David provided great comments which we will answer, revisit
in the ML (MACSEC being one)
Himanshu Shah
01:42:28
There is a wide implementation of the bashandy uloop avoidance
Jeffrey Haas
01:42:50
@Daniel Bernier You may wish to talk to the IETF liaison to IEEE for
part of your answer.
Himanshu Shah
01:42:53
as well as deployments. So we do not need two drafts and two solutions
Himanshu Shah
01:44:06
That was comment against @acee
Jeff Tantsura
01:45:23
IEEE 802.1: The current IETF liaison manager is János Farkas (contact:
ieee-8021-liaison@ietf.org)
Daniel Bernier
01:47:16
as per use case, think the draft was clear enough but can detail
further, in most DC use cases at the moment (leaving AI out for the
moment) ... pushing IGP or BGP variations to generally advertise a
single prefix (VTEP, POD space or per host locator space) I would assume
the simplest method would be preferred.
Most implementations (again) would still leverage LLDP for topology
awareness even if using BGP.
John Scudder
01:50:48
It’s unclear to me how the present talk (lossless) isn’t in detnet’s
remit.
John Scudder
01:50:54
“The Deterministic Networking (DetNet) Working Group focuses on
deterministic data paths that operate over Layer 2 bridged and Layer 3
routed segments, where such paths can provide bounds on reordering,
latency, loss, and packet delay variation (jitter), and high
reliability.”
Daniel Bernier
01:53:41
build evpn-mh 1st, then spin a worker automated cluster / host build +
then convert to LAG (oops I cannot do BGP then).
Ok now, let's do ISIS (no no will bust scale ... need multi-instance).
Ok let's do static-bfd ... oops need to know the destination to reach
Ok let's do BGP ... hmm who's my peer ? can I do unnumbered ? what's the
policy to apply ? do iBGP or eBGP ? can I auto-discover ? can I use LL ?