Minutes IETF121: rtgwg: Tue 09:30
minutes-121-rtgwg-202411050930-00

Meeting Minutes: Routing Area Working Group (rtgwg) WG
Date and time: 2024-11-05 09:30
Title: Minutes IETF121: rtgwg: Tue 09:30
State: Active
Last updated: 2024-11-12

IETF 121 RTGWG Meeting Minutes

Chairs:
Jeff Tantsura (jefftant.ietf@gmail.com)
Yingzhen Qu (yingzhen.ietf@gmail.com)

WG Page: https://datatracker.ietf.org/group/rtgwg/about/
Materials: https://datatracker.ietf.org/meeting/121/session/rtgwg

9:30-11:30 - Tuesday Session I, Nov 5th, 2024

=========================================================

  1. 9:30
    Meeting Administrivia and WG Update
    RTGWG Charter Update
    Chairs (15 mins)
  • No comment.
  2. 9:45
    TI-LFA, BGP-PIC and SR ULoop
    https://datatracker.ietf.org/doc/draft-ietf-rtgwg-segment-routing-ti-lfa/

    https://datatracker.ietf.org/doc/draft-ietf-rtgwg-bgp-pic/
    https://datatracker.ietf.org/doc/draft-bashandy-rtgwg-segment-routing-uloop/

    Ahmed Bashandy (15 mins)

[draft-ietf-rtgwg-segment-routing-ti-lfa]

  • Jim Guichard: It is not clear. John has a point. We need an offline
    call with John to clear up some confusion. The confusion is about
    whether something (related to the word "key") is mandatory or not.
    It might be clear to the author, who has worked on the document
    for a long time, but new readers can be confused.
  • Ahmed: Do you have specific suggestions for the changes?
  • Stewart: When it was presented, it was stressed that the repair
    path should be on the post-convergence path. If this has changed,
    you need to make the WG aware of why the "key" aspect changed. It
    should be documented.
  • Ahmed: It says it's important, but you don't have to follow it.
  • Stewart: The thesis said it was important. You need to explain why
    a fundamental of the design changed.
  • Ahmed: Some platforms couldn't do it; I thought it was obvious.
  • Stewart: We need to have a longer conversation.
  • Ahmed: OK, I'm available.
  • Sasha: Agree with Stewart. There is a lot of history behind the
    changes.
  • Jeff: Do you feel we need another discussion?
  • Ahmed: The original idea was to use the post-convergence path. This
    has been dropped and changed from MUST to SHOULD. I am open to
    either way: moving it back to mandatory, or explaining why it's not
    mandatory. I understand Stewart's and Sasha's comments.
  • Jeff: It is clear that the WG has feedback; please address it.

[draft-ietf-rtgwg-bgp-pic]

  • Yingzhen: If a term is already defined in an RFC, please use that
    term instead of creating a new one.
  • Ahmed: I am describing forwarding behavior, not BGP. The term in the
    RFC is not quite right for our case.
  • Yingzhen: There are comments on the list stating that some terms are
    already specified in existing RFCs. Let's take this offline.
  • Jim Guichard: Please send me the list of terms you think differ
    from the existing RFCs.

[draft-bashandy-rtgwg-segment-routing-uloop]

  • Yingzhen: People have asked for this draft. Do we really have to
    wait until you close the other two?
  • Ahmed: I have no time at the moment. As soon as possible.
  • Yingzhen: I promise to help with terminology.
  • Jeff: Going back and forth arguing about terminology is not time
    well spent.
  3. 10:00
    SR based Loop-free implementation
    https://datatracker.ietf.org/doc/draft-deng-rtgwg-sr-loop-free/
    Lijie Deng (10 mins)

  • Peter Psenak: What is the purpose of this draft? It is a local
    decision and has nothing to do with interoperability.
  • Lijie: It is important to document the different microloop
    scenarios.
  • Jeff: The draft doesn't introduce anything new. It mostly describes
    operational issues. It is informational. Please reference the
    existing documents to make the document useful.
  • Yingzhen: For example, how did you decide on those timer values?
  • Sasha: Look at RFC 5715 (A Framework for Loop-Free Convergence).
    Not sure why this draft is needed.
  4. 10:10
    Path-aware Remote Protection Framework
    https://datatracker.ietf.org/doc/draft-liu-rtgwg-path-aware-remote-protection/

    Yisong Liu / Changwang Lin (10 min)

  • Jeff: (1) Suggest explaining why more than one hop needs to be
    traversed; (2) most networks have 3 layers, not 2, so describe
    that; (3) the correlation between router ID and next hop is not
    clear, maybe another layer is needed.
  • Changwang: Will revise in the next version.
  • Maria: Why use an IPv4 public address? Are you reinventing the
    wheel? It doesn't improve BGP performance.
  • Peter: I second Jeff's comments.
  • Changwang: It's limited to spine-leaf topology. We'll update the
    draft accordingly.
  • Aijun: Why not use BGP? Why select a BGP-independent protocol to
    reflect failure?
  • Changwang: The current protocol can't provide the short protection
    time. It should be protocol independent.
  • Jeff Haas: Three comments. First, regarding the next-hop address:
    we could use the next-hop address; with the choice to use
    router-id, a mapping might be needed. Second, to Maria: BGP is not
    a fast-converging protocol. Third, the router info draft that will
    be presented later is the local detection mechanism.
  • Alexander Azimov: We already have a protocol stack for fast
    reroute. Why do we need a new one?
  • Jeff: It is an over-the-top path.
  5. 10:20
    Destination/Source Routing
    https://datatracker.ietf.org/doc/draft-ietf-rtgwg-dst-src-routing-revive/

    Shu Yang (5 mins)

  • Jeff: This is important work.
  6. 10:25
    Deep Collaboration between Application and Network
    https://datatracker.ietf.org/doc/draft-zhang-rtgwg-collaboration-app-net/

    Xinxin Yi (10 mins)

  • Aijun: Interesting. Let's discuss more in the side meeting tonight
    about expanding the capability of the network to cloud and other
    applications.
  • Daniel Huang: On massive data transmission, please clarify that it
    involves more than host and network; the application is also
    relevant.
  • Jeff: As chair: we have seen similar work for the last 2 years. If
    you want to use WG time to present a second time, please provide
    the details.
  7. 10:35
    The Challenges and Requirements for Routing in Computing Cluster
    network

    https://datatracker.ietf.org/doc/draft-li-rtgwg-computing-network-routing/

    Yizhou Li or Fengkai Li (10 mins)

  • Tony Li: Have you considered simply using a link-state protocol?
  • Yizhou: Traditional distributed routing is good, but requires
    heavy configuration. This can be overcome by centrally controlled
    routing. Hybrid routing fixes the problem of heavy configuration.
  • Tony Li: An IGP doesn't have the configuration complexity of BGP.
    The faults you mentioned are introduced by BGP.
  • Yizhou: Are you suggesting an IGP?
  • Tony Li: Some of the problems you mentioned are caused by BGP, so
    maybe you should consider an IGP.
  • Yizhou: Will cover the IGP case in the next version.
  • Jeff T: There is a comment from Jeff Haas: you may want to consider
    LSVR.
  8. 10:45
    In-Network Congestion Notification
    https://datatracker.ietf.org/doc/draft-du-rtgwg-in-network-congestion-notification/

    Zongpeng Du (10 mins)
  • No comment.
  9. 10:55
    Adaptive Routing Framework
    https://datatracker.ietf.org/doc/draft-cheng-rtgwg-adaptive-routing-framework/

    Changwang Lin/Rui Zhuang

  • Aijun: What are your considerations for avoiding traffic
    congestion?
  • Yingzhen: Please take it to the mailing list.
  10. 11:05
    Generalized IPv6 Tunnel (GIP6)
    https://datatracker.ietf.org/doc/draft-li-rtgwg-gip6-protocol-ext-requirements/

    https://datatracker.ietf.org/doc/draft-li-rtgwg-generalized-ipv6-tunnel/04/

    Xinxin Yi/Zhenbin Lin/Qiangzhou Gao (10 mins)

  • No comment.
  11. 11:15
    Advertising Router Information
    https://datatracker.ietf.org/doc/draft-zzhang-rtgwg-router-info/
    Jeffrey Zhang (5 mins)
  • Yingzhen: The use of the word "flooding" should be reconsidered.

=======================================================================

Chat
Yingzhen Qu
00:03:38

good morning.

Jiaming Ye
00:13:53

Dear hosts and chairs, I am Zhuang Rui from CMCC. There were some issues
with my account, so I borrowed a colleague's account to give this
speech. When it's my turn to give the speech, please allow me to use
this account to speak and turn the page of the PowerPoint slides. Thank
you very much.

Yingzhen Qu
00:15:24

ok. when it's your turn, please just speak up, so we know it's you

Yingzhen Qu
00:15:55

please help with the note taking:
https://notes.ietf.org/notes-ietf-121-rtgwg?both

Jiaming Ye
00:16:25

Thank you very much!

Juliusz Chroboczek
00:21:15

How do I put a video in full screen with the new Meetecho UI?

Lorenzo Miniero
00:21:54

Juliusz: if you hover over the video, there are some icons that allow
you to enlarge it

David Lamparter
00:21:57

arrow icon button top-left when you hover on the video

Juliusz Chroboczek
00:22:17

Thanks, David.

Juliusz Chroboczek
00:22:35

... and Lorenzo.

Jeffrey Haas
00:25:59

@Ahmed Bashandy If you think Alvaro is out of bounds on the terminology,
feel free to also ask idr-chairs@ietf.org for opinion.

Jeffrey Haas
00:26:27

BGP terminology spans more than just the core BGP RFC.

Aijun Wang
00:39:34

Sasha, your audio is breaking up. Would you like to raise the question
in the chat?

Aijun Wang
00:41:27

@Peter, this draft describes the different scenarios that may exist
within the network due to temporary loops (micro-loops). It can
certainly give the operator awareness of the possible failures and
corresponding solutions.

Abdussalam Baryun
00:50:01

@Yingzhen, good morning

Tobias Fiebig
00:59:03

@Jeff well, there is this attempt to at least list some currently used
terminology, trying to get adopted by GROW. ;-)

Jeffrey Haas
00:59:45

An effort greatly appreciated, @tobias. As we'd discussed, eventually
getting some of this normalized all the way up to the RFC Editor will be
a good thing.

Tobias Fiebig
01:00:55

Yeah, I think that will be a long path. I will already be happy if we
get a somewhat semi-exhaustive list of terms that are currently in use
without any claim to being authoritative.

Jeffrey Haas
01:01:05

If I'm understanding Jeff and Alexander correctly, the issue with having
TCP address this is that some of the drop/rebalance scenarios are too
short-lived to be addressed by adjusting the flow labels at the head
end. That's effectively a form of global "repair".

Jeff Tantsura
01:02:40

@Jeff, you are correct, this has been deployed for both, TCP and UDP,
but indeed, orthogonal to networking convergence (it might however
interact in very unpleasant ways :))

Jeffrey Haas
01:03:30

I think that some of the confusion in the presentations is due to a
need for more clarity about the duration of the events that are being
mitigated.

Jeffrey Haas
01:04:07

Things that can be dealt with via global mechanisms, including poking
new entropy into the headend... do that. But that doesn't solve many
short lived problems.

Jeff Tantsura
01:04:49

@Jeff agree, timing of events is not well described

Peng Liu
01:20:22

Some of the capabilities have been discussed, and some have a related
existing working group. My suggestion is to focus on fewer use cases and
analyze them deeply; you may get some new specific points.

Tom Hill
01:20:40

Hear hear, Jeff T

Jeffrey Haas
01:27:54

I wonder if the speaker is familiar with LSVR

Jeffrey Haas
01:33:00

The critique of BGP vs. config is fair. Similarly, so is the fact that
IGPs will have a "shake the topology" result when link state is updated
and needs to be flooded. That said, see prior works about constrained
flooding.

Juliusz Chroboczek
01:34:27

Probably very naive question, but concerning BGP config, why isn't it
simply a matter of providing sensible defaults in implementations?

Jeff Tantsura
01:34:48

I'd argue that "extensive config" issues have long been solved

Jeffrey Haas
01:35:16

I agree with Jeff. The config complexity exists, and is usually
addressed via hiding the complexity using templates.

Tom Hill
01:35:31

At a certain point, stability in a network is a layer-0 problem. Choices
are made that no network protocol can mitigate against the effects of.

Yingzhen Qu
01:35:49

@meetecho, the screen that the presenter can see doesn't show the timer.

Jeff Tantsura
01:35:49

+1 Tom

Jeffrey Haas
01:36:06

There is some work previously done overlapping BGP autoconfiguration in
IDR that would likely address many of the "boring" Clos topology
scenarios. That work, sadly, did not progress.

Juliusz Chroboczek
01:36:27

Jeffrey, link?

Adrian Farrel
01:36:39

The authors would be well advised to look at
https://datatracker.ietf.org/doc/html/rfc2386#section-9.4

Adrian Farrel
01:37:01

(Maybe that is what Jeff H refers to)

Lorenzo Miniero
01:37:48

@Yingzhen Qu you're right, I think they see the video agent and not the
screen agent on that screen: the screen is on the other side. We'll ask
the AV team to swap their content

Jeffrey Haas
01:38:11

https://wiki.ietf.org/group/idr/BGPAutoconfiguration - this covers the
drafts. The usage of these for addressing easy fabric was mostly "future
work"

Jeffrey Haas
01:39:02

That wiki misses this one:
https://datatracker.ietf.org/doc/html/draft-minto-idr-bgp-autodiscovery/

Jeff Tantsura
01:39:04

FRR (most used BGP implementation in DC) requires only internal vs
external definition + int_name to augment LL

Juliusz Chroboczek
01:39:07

Thanks Jeffrey, I'll have a look at my leisure.

Tom Hill
01:40:30

I was thinking that, Jeff T. It was readily automatable in ansible/chef,
etc

Tom Hill
01:40:41

(Last I looked)

Juliusz Chroboczek
01:40:50

Adrian, RFC 9616 does the opposite of the section you cited :-)

Lorenzo Miniero
01:44:29

@Yingzhen Qu the AV team told us it's swapped, can you confirm the
presenters can see it now?

Yingzhen Qu
01:44:45

yes. thanks!

Joel Halpern
01:47:11

Regarding RFC 9616, note that Babel has the unusual property that no
matter what you do to the metrics, it never loops. And I believe that
RFC aims at link delay response, not congestion response.

Joel Halpern
01:47:59

@Juliusz, I know you know the difference, but I think your reference may
well confuse other people.

Juliusz Chroboczek
01:49:22

Joel, agreed. I was just adding a footnote to Adrian's mention of RFC
2386.

Juliusz Chroboczek
01:50:09

Re link delay vs. congestion, I expect the oscillation issues to be very
similar.

Joel Halpern
01:51:14

Experience indicates that the oscillation of congestion response is
VERY different from the response to relatively stable (not fixed, but
not changing with the traffic) link delay. Conflating the two produces
disasters.

Joel Halpern
01:51:52

From where I sit, any proposal to perform distributed congestion
response needs to explain why and how it deals with the issues from RFC
2386.

Joel Halpern
01:52:27

Maybe it is possible to finesse the problems (Babel at least changes
some of them) but it is not obvious or clear.

Jeff Tantsura
01:53:49

note that in the current DC networks, when congestion is signaled by the
routers (ECN marking, INT, etc), the action is always taken by the end
host. If both (hosts and infra) take actions independently, this can
result in rather unhealthy interactions

Juliusz Chroboczek
01:54:05

Joel, I fully agree.

Joel Halpern
01:54:39

@Jeff, yes, that is another old result from control theory that needs to
be kept in mind.

Jeff Tantsura
01:54:49

yep