Minutes IETF118: rtgwg: Tue 08:30
minutes-118-rtgwg-202311070830-00

Meeting Minutes Routing Area Working Group (rtgwg) WG
Date and time 2023-11-07 08:30
Title Minutes IETF118: rtgwg: Tue 08:30
State Active
Last updated 2023-11-18

IETF 118 RTGWG Minutes

Chairs:
Jeff Tantsura (jefftant.ietf@gmail.com)
Yingzhen Qu (yingzhen.ietf@gmail.com)

WG Page: https://datatracker.ietf.org/group/rtgwg/about/
Materials: https://datatracker.ietf.org/meeting/118/session/rtgwg

9:30-11:30 - Tuesday Session I, November 7, 2023

9:30
Meeting Administrivia and WG Update
Chairs (10 mins)

9:40
Discussion of draft-ietf-rtgwg-segment-routing-ti-lfa
Stewart Bryant (15 mins)

  • Ahmed Bashandy: I don't see any reason why we should combine the
    two drafts.
  • Stewart: We should not suggest any deployment of FRR without
    micro-loop avoidance.
  • Ahmed: This has been deployed for years. This is not a consensus.
  • Bruno: I agree with Ahmed. These are two different problems that
    require two different solutions. I don't understand why you mix
    them together.
  • Stewart: In TI-LFA, they are together. It requires further
    clarification. We need strong words for micro-loop avoidance
    mechanisms.
  • Bruno: We have a document about the uloop and it has been sent to
    the list. Please give comments on the document. I disagree with
    many points. In particular, we have deployed FRR methods for many
    years and I don't think this is unsafe. And the uloop is not caused
    by TI-LFA.
  • Stewart: I have personal concerns since I have seen how easily
    microloops occur, such as in a ring network. But we should give
    more explanation.
  • Bruno: How about the text we proposed?
  • Stewart: I'll look again.
  • Bruno: I disagree that it's unsafe to deploy TI-LFA.
  • Stewart: There are people worrying about the label stack size.
  • Bruno: For a link failure, it's guaranteed to be less than 2 labels.
  • Stewart: Indeed. We should provide more guidance.
  • Peter: I agree with Bruno. Micro-loops are not caused by fast
    reroute. They are caused by IGP convergence. There are many
    mechanisms for micro-loop avoidance.
  • Stewart: We should consider adding an operational considerations
    section.
  • Jeff T: I suggest cooperating, especially with the shepherd.
  • Yingzhen: There will be a side meeting to discuss further.

Multi-segment SD-WAN via Cloud DCs
https://datatracker.ietf.org/doc/draft-dmk-rtgwg-multisegment-sdwan/
Linda Dunbar (5 mins)

  • Jeff T: Happy to see the progress and will start to review the
    document (considering WG adoption)

  • From the chat:
    Daniel Bernier (09:58): @Linda why is this draft standards track
    and not informational?
    Linda: Because IANA registrations are needed for the proposed
    TLVs and sub-TLVs. Please see the IANA section in the draft.
    Antoine Fressancourt: @Linda Dunbar when and where will the draft
    be discussed from a security perspective? I didn't get it.

Security Considerations for Tenant ID and Similar Fields
https://datatracker.ietf.org/doc/draft-eastlake-secdispatch-tenantid-consid/

Donald Eastlake (10 mins)

  • Jeff: What is your goal for this document?
  • Donald: Other documents could refer to this document for security
    considerations, which would make things easier for those documents.
  • Jeff: That will make it a normative rather than informative
    reference?
  • Donald: That depends.
  • Jeff: If you want to make a normative reference, it needs to be a
    standards track document.
  • Donald: The way it's written, it is meant to be informational.

Problem Statement, Use Cases, and Requirements of Hierarchical SFC
with Segment Routing

https://datatracker.ietf.org/doc/draft-nh-sr-hsfc-usecases-requirements/

Kangkang Ni (10 mins)

  • Greg Mirsky: The document states that segment-routing-based SFC has
    advantages over NSH-based SFC. How do you envision packets getting
    to the service function? Will your proposal work with test packets?
  • Kangkang: Let's communicate more on the list.
  • Greg: Which identity is used for the Service Function Forwarder?
  • Kangkang: Maybe we can discuss that on the list.
  • Greg: I suggest referring to the SFC architecture document to
    align the terminology.
  • Jeff T: Please send the comments to the list.
  • Robin: OAM should be considered when discussing SFC.

Reliability Framework for SRv6 Service Function Chaining
https://datatracker.ietf.org/doc/draft-yang-rtgwg-srv6-sfc-reliability-framework/

Feng Yang / Changwang Lin (10 mins)

  • Greg: In the dual-homing SF case, what mechanism do you envision to
    detect failure?
  • Feng: BFD will be fine.
  • Greg: How do you differentiate the two types of failures?
  • Feng: We treat the two the same.
  • Greg: One of the cases may cause a routing loop.
  • Feng: That's right. There should be some sync mechanism so that the
    two SFs behave the same way.
  • Greg: So your scheme gets more complex.
  • Dongyu Yuan: There are two methods of protection. Is it possible to
    combine them?
  • Feng: Good idea. We should consider it in a later version.

Kademlia-directed ID-based Routing Architecture (KIRA)
https://www.ietf.org/archive/id/draft-bless-rtgwg-kira-00.txt
Roland Bless (10 mins)

  • Tony Przygienda: Are you proposing a new dataplane?
  • Roland: You can use normal IP forwarding; you can use existing data
    plane forwarding mechanisms like longest prefix match. It was
    designed for forwarding control plane traffic, so data plane routing
    is separate.
  • Antoine Fressancourt: What is the difference from Babel?
  • Roland: It's IP based. It is more scalable and was designed mainly
    for the control plane, because it does not always use the shortest
    path routes.
  • Juliusz Chroboczek: The ad-hoc community has been doing this for
    many years. DHT-based routing mechanisms did not work very well
    there, leading to loops. You should come.
  • Roland: KIRA was not designed for ad-hoc environments. I am aware
    of the ad-hoc work.
  • Jeff Tantsura: KIRA paper is behind a paywall.
  • Roland: (fixed at https://s.kit.edu/KIRA) You can download a
    preprint version here:
    https://publikationen.bibliothek.kit.edu/1000148953

Service ID for Addressing and Networking
https://datatracker.ietf.org/doc/draft-huang-rtgwg-sid-for-networking/
Daniel Huang (10 mins)

  • Gao Fang: On slide #7, who publishes the service ID? Is the
    control system the one for the network or for the cloud? For
    multi-cloud, how is sync maintained among multiple clouds?
  • Daniel: Service ID is not a solution. The service ID will be
    governed by a centralized control system.
  • Jeff T: Please send your questions to the list.
  • Xuesong: It is confusing. What is the relationship between service
    ID and APN ID or ROSA?
  • Daniel: Service ID is a lightweight label on the data plane. It is
    a unified identifier among terminal, network, and cloud. APN ID is
    only for a limited domain inside the network.
  • Linda: Is the service ID an address?
  • Daniel: No, the Service ID is an ID carried in the packets, such as
    the FlowID in the IPv6 header.
  • Linda: Why don't you use the flow ID?
  • Daniel: It might be. I don't want to focus on a specific solution
    yet.
  • Xuesong: The assumption that APN ID is limited to inside the domain
    is based on previous community discussion; it is not a limitation
    of APN itself. We have a presentation about application-side APN
    work in this session.
  • Daniel: I have read the document and we could have further
    discussion.

Reliability in AI Networks Gap Analysis, Problem Statement, and
Requirements

https://datatracker.ietf.org/doc/draft-cheng-rtgwg-ai-network-reliability-problem/

Weiqiang Cheng / Changwang Lin (10 mins)

  • Jeff T: Are you calling for solutions?
  • Weiqiang: We are just giving problem statements without solutions.
    We hope people can propose solutions for the identified problems.
  • Jeff T: Over the years, there have been many solutions to address
    the problems you have identified. Please go back and select from
    those solutions. For the numbers you mentioned, it would be helpful
    to provide some references.
  • Juliusz: Why do you need sub-millisecond convergence time?
  • Weiqiang: If you can keep switchover time under 100 ms, the AI
    training can continue rather than going back to a checkpoint or
    even restarting the training.
  • Juliusz: I will take it offline.
  • Victor: I agree with Jeff and would like to see more discussion of
    the requirements and the analysis behind them, especially those
    numbers.
  • Jeff: We will have a side meeting.
  • Yunping: Are there different solutions for different topologies,
    like fat tree or dragonfly?
  • Weiqiang: A unified solution for all topologies.

Coordinated Congestion Management
https://datatracker.ietf.org/doc/draft-lyu-rtgwg-coordinated-cm/
Yunping (Lily) Lyu (15 mins)

  • Jeff T: You are assuming there are flows. With adaptive routing,
    there might not be flows, only packets. Many assumptions in the
    draft are not valid.
  • Lily: We can have more discussions online.
  • Dongyu Yuan: This is similar to InfiniBand. Should it inherit some
    mechanisms from there? Are there any differences between IP-layer
    ECN and L4 congestion control?
  • Lily: Yes. The key point is that we need to coordinate congestion
    control and adaptive routing.
  • Linda D: Can you use ECN (RFC 6040) to achieve what you want to do?
  • Lily: We can use ECN, but we need to coordinate when to trigger
    congestion control and when to use adaptive routing.