Minutes IETF120: rtgwg: Mon 20:00
minutes-120-rtgwg-202407222000-00

Meeting Minutes Routing Area Working Group (rtgwg) WG
Date and time 2024-07-22 20:00
Title Minutes IETF120: rtgwg: Mon 20:00
State Active
Last updated 2024-08-05

IETF 120 RTGWG Minutes

Chairs:
Jeff Tantsura (jefftant.ietf@gmail.com)
Yingzhen Qu (yingzhen.ietf@gmail.com)

WG Page: https://datatracker.ietf.org/group/rtgwg/about/
Materials: https://datatracker.ietf.org/meeting/120/session/rtgwg

13:00-15:00 - Monday Session I, July 22, 2024

1. 13:00

Meeting Administrivia and WG Update
Chairs (10 mins)

3. 13:20

A YANG Data Model for the Virtual Router Redundancy Protocol (VRRP)

https://datatracker.ietf.org/doc/draft-acee-rtgwg-vrrp-rfc8347bis/
Acee Lindem (10 mins)

  • Jeff T: As a WG participant, I support this work and option #3. The
    chairs will send this to the list.

John asked a question in the chat (Sorry, John. The chairs missed the
queue.)

  • John Scudder: What I was in line to ask was if Acee would Obsolete
    the v1 vrrp model, as well. I assume so but it wasn’t on the slide.

4. 13:30

YANG Data Model for ARP and IPv6 Address Resolution

https://datatracker.ietf.org/doc/draft-ietf-rtgwg-arp-yang-model/
https://datatracker.ietf.org/doc/draft-zhang-rtgwg-ipv6-address-resolution-yang/ 

Fan Zhang (10 mins)

  • Yingzhen: The ARP YANG model was a WG doc and expired. We have a new
    editor for the draft. The authors have also added a new draft for
    IPv6 address resolution.

6. 13:50

Application-Responsive Network Framework

https://datatracker.ietf.org/doc/draft-yang-rtgwg-arn-framework/
Feng Yang (10 mins)

  • Acee Lindem: What does 2B/2C/2H mean?
  • Feng Yang: 2B means "to business", for example enterprise. The three
    mean "to business users", "to mobile users", and "to residential
    users" respectively.
  • Yingzhen: There are proposals on different IDs. We need to figure
    out whether the use cases are applicable, then we need to
    consolidate the APN ID/ARN ID/Service ID raised in RTGWG.
  • Acee Lindem: What is the advantage of allocating an ARN ID, a new ID
    in the headers, over just assigning a behavior to a SID?
  • Feng Yang: SRv6 is within a limited domain. We need some mechanism
    to enable customers to access the networking capabilities in
    access networks.
  • Acee: Are you suggesting standardizing the IDs for consistent
    behaviors?
  • Feng: Yes.
  • Joel Halpern: I understand the difficulties of classifying customer
    traffic at the edge. The presentation talks about using a new ID for
    this classification. Why is that more trustworthy than DSCP or the
    existing identifiers?
  • Feng: DSCP doesn't carry user information.
  • Joel: There are 6 DSCP bits. Their meanings are recommended, not
    required. If you assume you need something to identify a service in
    your network, you can use a network ID (such as an MPLS tag, SRv6
    ID, or Network ID); you don't need a network ARN ID.
  • Feng Yang: We can't use an SRv6 SID because SRv6 is within a limited
    domain.
  • Joel: But the ARN ID is within your limited domain.
  • Feng Yang: No. It can be encapsulated by the customer.
  • Joel: I thought you said the user ARN ID can be allocated by the
    user, not the network ARN ID. Otherwise you're allowing customers to
    direct traffic, which gets you to a new set of problems. Why is NSH
    not trusted? Because the network can't trust customers'
    classifications.
  • Feng: We want this to work across domains.
  • Joel: you mean between operators?
  • Feng: Yes.
  • Jim Guichard: There is a lot of work done by the TEAS WG. Please
    have a look.
  • Jeff T: Please continue on the list.
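
To make the DSCP point in the exchange above concrete: DSCP occupies the upper 6 bits of the IPv4 ToS / IPv6 Traffic Class byte, and the lower 2 bits carry ECN. The following is a minimal sketch, not anything proposed in the draft or at the meeting. The mapping table, the service names, and the `trusted_edge` flag are hypothetical and only illustrate that a code point's meaning is assigned locally by the operator and that markings from an untrusted customer edge are typically ignored.

```python
def split_traffic_class(tos: int) -> tuple[int, int]:
    """Split the 8-bit IPv4 ToS / IPv6 Traffic Class byte into (DSCP, ECN)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("traffic class must fit in one byte")
    dscp = tos >> 2    # upper 6 bits: 64 possible code points
    ecn = tos & 0b11   # lower 2 bits: ECN field
    return dscp, ecn


# Hypothetical operator-local mapping. DSCP values carry recommended,
# not required, meanings, so each domain decides what they select.
LOCAL_SERVICE_CLASSES = {46: "low-latency", 0: "best-effort"}


def classify(tos: int, trusted_edge: bool) -> str:
    """Return the local service class for a packet's traffic class byte."""
    dscp, _ecn = split_traffic_class(tos)
    if not trusted_edge:
        # Assumption for illustration: customer markings are not trusted.
        return "best-effort"
    return LOCAL_SERVICE_CLASSES.get(dscp, "best-effort")
```

For example, `split_traffic_class(0xB8)` yields DSCP 46 with ECN 0, and `classify(0xB8, trusted_edge=False)` falls back to best-effort regardless of the marking.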

7. 14:00

Adaptive Routing Notification for Load-balancing

https://datatracker.ietf.org/doc/html/draft-liu-rtgwg-adaptive-routing-notification

Yao Liu

  • Sasha Vainshtein: I have not heard of any loop avoidance mechanisms.
  • Yao Liu: one option is to multicast the link adjustments, so other
    nodes can adjust accordingly. How to control the multicast should be
    considered. There will be more details included in the future
    version.
  • Jeff T: You are changing routes on the fly; there is a risk of route
    oscillation.
  • Dean: This is just another QoS. Why not use existing QoS
    mechanisms? What are the chances of this ever getting used?
  • Yao: In this doc, it can be used together with other mechanisms.
  • Jeff T: Please continue the discussion on the list, especially on
    how to avoid loops and oscillation.

5. 13:40

Advertising Router Information

https://datatracker.ietf.org/doc/draft-zzhang-rtgwg-router-info/
Jeffrey Zhang

  • Antoine Fressancourt: Why not use the link-state routing
    protocols?
  • Jeffrey Zhang: The notification here needs to happen quickly. And
    there are concerns about using routing protocols as a dump truck.
  • Nan Geng: About flow redirection, can you explain how to use the
    sub-TLV?
  • Jeffrey Zhang: The router for some reason wants this traffic to not
    be sent to it. Whether this use case is valid or not, that can be
    discussed.
  • Acee: Is the information from your neighbor? Do you save the
    information, or is it stateless? In LSVR, there is an L3 discovery
    protocol that has all the security aspects already considered. This
    seems like you just tell something, and it's retained.
  • Jeffrey: It's a lot like IGP link-local flooding but very quick,
    maybe milliseconds.
  • Acee: there are proprietary BFDs, where they put stuff in it.
  • Tianji: how to recognize neighbors over the overlay path? to address
    remote flooding?
  • Jeffrey Z: We have another draft documenting remote flooding and
    neighbor discovery over overlay paths. I'll publish that soon.
  • Tianji: some receiving nodes will need to be configured, right?
  • Jeffrey: yes.

Jeff T: We've been getting drafts about fast notifications for a long
time; we will consider holding an interim focusing on this issue.

8. 14:10

Adaptive Routing Framework

https://datatracker.ietf.org/doc/draft-cheng-rtgwg-adaptive-routing-framework/

Jiaming Ye

  • Dan Voyer: do you have any mechanism to trace the flow once you move
    it? I didn't see that in the document.
  • Hongyi Huang: what's the difference between adjust path and path
    load?
  • Jiaming: weight of the ECMPs can be adjusted to impact flows.
  • (scribe missed): Is the node value set manually?
  • Hongyi Huang: is it configured manually?
  • Jiaming Ye: It is determined by the congestion level, which is
    determined by the congestion point.
  • Acee: There are drafts talking about per-packet load balancing. This
    makes the assumption that, for a very big flow, the problem is
    pushed to the application for packet re-ordering etc. I'm not sure
    this works.
  • Jeff Tantsura: By the time you get to recognizing the flow problems,
    a great deal of the flow may be lost. I do not know of any mechanism
    that can quickly readjust those flows.
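
As a rough illustration of the statement above that ECMP weights can be adjusted to impact flows, here is a minimal sketch of hash-based weighted next-hop selection. It is not the mechanism specified in the draft. The function name, the next-hop names, the weights, and the 5-tuple values are all hypothetical, and SHA-256 stands in for whatever hash a real forwarding plane would use.

```python
import hashlib


def pick_next_hop(flow: tuple, weights: dict[str, int]) -> str:
    """Map a flow's 5-tuple onto one next hop, proportionally to weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one next hop needs a positive weight")
    # A stable hash keeps every packet of a flow on the same path.
    digest = hashlib.sha256(repr(flow).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Walk the next hops in a fixed order, consuming each one's share.
    for hop in sorted(weights):
        if bucket < weights[hop]:
            return hop
        bucket -= weights[hop]
    raise AssertionError("unreachable when weights are non-negative")


# Hypothetical flows: (src, dst, protocol, src port, dst port).
flows = [("10.0.0.1", "10.0.1.1", 17, 49152 + i, 4791) for i in range(1000)]
before = [pick_next_hop(f, {"spine1": 1, "spine2": 1}) for f in flows]
# Assume spine1 reports congestion, so its relative weight is lowered.
after = [pick_next_hop(f, {"spine1": 1, "spine2": 3}) for f in flows]
moved = sum(1 for b, a in zip(before, after) if b != a)
```

Here `moved` counts the flows whose path changes when the weights change. Packets of those flows that are already in flight can arrive out of order, which is the re-ordering concern raised in the discussion.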

9. 14:20

Framework for Implementing Lossless Techniques in Wide Area Networks

https://datatracker.ietf.org/doc/draft-he-huang-rtgwg-wan-lossless-framework/

Tao He (10 mins)

  • Greg Mirsky: what is the difference in terms of requirements of
    bounded latency and packet loss from deterministic networking
    working group?
  • Tao He: Our solutions focus on packet loss.
  • Greg Mirsky: DetNet includes packet loss. Please consider discussing
    this with the DetNet group.
  • Tao He: This solution is limited to packet loss.
  • Greg: Let's take it to the list.

10. 14:30

Use Cases for High-performance Wide Area Network

https://datatracker.ietf.org/doc/draft-xiong-rtgwg-use-cases-hp-wan/
Daniel Huang/Quan Xiong (10 mins)

  • Dean Bogdanovic: DetNet created RFC 8578 for use cases. They are
    not focused on network technologies. Please take a look at the
    DetNet documents. There are 9 vertical areas being discussed, along
    with their expectations from the network. Also, not all data are the
    same. Some data can be lost, some data can't be lost. You're not
    distinguishing them. Look at this document and suggest what you can
    improve.
  • Daniel H: Detnet is good. However, we have to consider cost. it is
    too expensive to deploy DETNET in WAN. We're trying to balance.
  • Dean Bogdanovic: The network cost is small compared to the energy
    costs in case of a catastrophic event.
  • Greg Mirsky: agree what you describe is similar to DETNET. The cost
    can be justified.
  • Hongyi: How do you define high-performance here?
  • Daniel: I agree this partly overlaps with DetNet.

Jeff T: There will be a side meeting on AIDC on Wednesday where some
similar problems will be talked about. There will be presentations about
adaptive routing and congestion control as well.

The following presentations were not presented due to time limitations.

  1. 14:40
  • Jeff: I'm sorry - but we have run out of time. We will give you a
    slot at the next IETF.
  1. 14:50

If time permits:
Fast Congestion Notification Packet (CNP) in RoCEv2 Networks
https://datatracker.ietf.org/doc/draft-xiao-rtgwg-rocev2-fast-cnp/
Xiao Min

From Chat

Loa Andersson
00:01:46

volume low on your mic

Yingzhen Qu
00:18:08

collective note taking: https://notes.ietf.org/notes-ietf-120-rtgwg

Ketan Talaulikar
00:20:10

I think the option 3 is better given the motivation. Suggest calling it
ietf-vrrp3 to reflect the protocol version which obsoletes the old RFC

Jim Guichard
00:20:40

someone is in the queue

John Scudder
00:21:38

I guess there was no time for Q/A? What I was in line to ask was if Acee
would Obsolete the v1 vrrp model, as well. I assume so but it wasn’t on
the slide.

John Scudder
00:21:48

Anyway seems reasonable.

Yingzhen Qu
00:22:18

John, can you please send your question to the list?

Jim Guichard
00:33:03

@chairs please keep an eye on the queue

Yingzhen Qu
00:33:37

@Jim, sure.

David Black
00:34:03

2B = To Business?

Daniel Bernier
00:44:37

It would be interesting to validate the challenge versus BBF work with
traffic steering

Daniel Bernier
00:44:43

ie WT-474

Daniel Bernier
00:44:51

that is basically the purpose of it

Jeff Tantsura
00:45:31

Dan - could you please send an email to the list with your comment?

Daniel Bernier
00:45:44

yes I will ;-)

Joel Halpern
00:51:03

I saw no reference in this draft to RFC 2386. Nor to the hard problems
that the RFC discusses. Without that, I do not see the point of
discussing a notification protocol.

Joel Halpern
00:51:33

(Sent to chat since the chairs do not seem to want discussion at the
microphone.)

Tony Li
00:58:49

Loop avoidance is the responsibility of your PCE

Nitsan Dolev Elfassy
01:00:32

PCE may not be able to follow the changes ...

Tony Li
01:02:26

If the changes are at too high a frequency for a PCE, then you lack a
stable control loop.

Boris Khasanov
01:02:27

If PCE gets LSDB (TED) it will but with some delay

Joel Halpern
01:03:53

The whole approach lacks a stable control loop. The authors may or may
not have one in mind, but the congestion response draft doesn't include
it.

Liu Yao
01:08:42

@Joel so the suggestion is to add the control loop in the draft on how
to respond to the congestion notification, right?

Joel Halpern
01:12:09

@liu Yao No. My suggestion is to start by explaining why the problems
identified in the mid-90s with any approach like this don't apply. Only
after those problems are recognized and discussed is there any point in
discussing mechanisms.

Nitsan Dolev Elfassy
01:12:47

The whole approach does not analyze the load of the response traffic
itself ...

Joel Halpern
01:12:58

It may be that you can find a way around those problems. But that is
where you need to start.

Liu Yao
01:14:55

Got it, will work on this, thanks for pointing it out

David Black
01:23:08

Is this mechanism capable of re-ordering packets within a flow?

Liu Yao
01:23:15

@Nitsan generally, an ARN suppression mechanism should be used
accordingly to avoid a notification storm. Will add analysis of this
aspect.

Tony Li
01:23:35

Not just capable...

Jeff Tantsura
01:24:53

it will cause packet reordering

David Black
01:25:08

Not good for a number of higher-layer protocols.

Tony Li
01:25:22

The only upside is that it will be rate limited by BGP convergence time.

David Black
01:25:34

Packet-based adjustment mode looks like a particular problem.

Joel Halpern
01:26:14

Moving an elephant flow does seem highly likely to simply move the
congestion, not avoid it.

Jeff Tantsura
01:26:46

in DC (10ns RTT) - BGP as the main control loop signaling is not very
suitable for that

David Black
01:26:48

And if the elephant transport stumbles due to reordering around the
move, the move appears to help ... until it doesn't.

Nitsan Dolev Elfassy
01:28:10

Should not one expect an analysis of the various congestion causing
factors before approaching with these congestion control tools ? Have we
done such analysis ?

Cheng Li
01:29:42

I would like to know how bad the problems caused by congestion are
today, before discussing the solution

Dan Voyer
01:30:03

good point Cheng

John Scudder
01:30:40

If only there were some prior art on load-sensitive routing.

Joel Halpern
01:31:41

While I presume some of the reality has changed, RFC 2386 does seem a
good starting point :-)

Tony Li
01:31:50

? Congestion comes from having insufficient bandwidth available to
transport the available load. This may be due to insufficient capital,
insufficient deployment, or inadequate ECMP paths. For DC purposes, a
good first order approximation is an infinite demand.

Eduard V
01:32:02

Cheng, ping me - I will explain. It is a consequence of cutting upper
layer switches from the hyper-scale DC/Cloud (replacing by mesh).

Cheng Li
01:32:31

thank you Eduard :)

Dan Voyer
01:33:01

@tony Li, it could be a traffic spike - e.g., when the game industry
releases a new patch/upgrade it does create a spike

Tony Li
01:34:45

@Jeff Tantsura I agree that any current control plane is out of the
question. One could, in principle have faster, lower-level control
loops.

Joel Halpern
01:35:34

One could congest two paths at once by using detnet preof :-)

Jeff Tantsura
01:35:44

@tli this is the topic of most of today's presentations, some kind of
"fast notification"

Tony Li
01:35:47

@Dan Voyer You're absolutely right, I left out insufficient foresight
and planning.

Weiqiang Cheng
01:38:04

Jeff, I agree with you that BGP may not be suitable, and we need some
fast solution. It is possible because the AIDC network topo is simpler

Jeff Tantsura
01:41:45

Important point to consider - perhaps transport network is not the right
place to solve the problem; end-points in most cases run some kind of
congestion control, either ECN marking based (e.g DCQCN) or timer based
(e.g. timely) - these are much faster in detection and can mitigate at
the source - by reducing TX rate or changing entropy

Weiqiang Cheng
01:41:47

@cheng li, adaptive routing is designed for AIDC, in which the traffic
model is really different from an IP WAN network. The size per flow may
be 400G, for example for all-reduce traffic from only one GPU.

Cheng Li
01:44:38

it is huge, so it might be worthy to investigate :)

Tony Li
01:52:12

Shouldn't this be in QUIC and answered by same?

David Black
01:53:05

Or tsvwg for the CC aspects. Dan's mike comment on relationship to
detnet is very much on point.

Weiqiang Cheng
01:53:07

@Dan, I don't think IP WAN networks need adaptive routing. It is for AI
training data centers. When you run a training task, all the GPUs
involved in one all-reduce task may be running at line rate at the same
time, such as 400Gbps. It will cause significant congestion, and then
lower the training efficiency. The traffic model is much worse than that
in IP WAN networks, even in the worst case.

Jeff Tantsura
01:54:02

in CCWG we are progressing HPCC documents, these are INT based (switch
assisted) congestion control for AIDC