[{"author": "Ketan Talaulikar", "text": "

Nice to see the chairs in the spotlight ;-)

", "time": "2024-03-18T23:30:38Z"}, {"author": "Yingzhen Qu", "text": "

please help with collective note taking: https://notes.ietf.org/notes-ietf-119-rtgwg?both

", "time": "2024-03-18T23:32:14Z"}, {"author": "Andrew Alston", "text": "

Adoption calls still have to go to the list don't they?

", "time": "2024-03-18T23:44:50Z"}, {"author": "Acee Lindem", "text": "

Should go to INT Area

", "time": "2024-03-18T23:45:39Z"}, {"author": "Yingzhen Qu", "text": "

@Andrew, yes, the adoption call will go to the list.

", "time": "2024-03-18T23:50:58Z"}, {"author": "Himanshu Shah", "text": "

On Path aware remote protection - I believe the author is proposing a scheme to handle the remote failure with a path aware backup path already programmed in the FIB.

", "time": "2024-03-19T00:02:35Z"}, {"author": "David Lamparter", "text": "

@Antoine didn't catch your question for the notes either, sorry

", "time": "2024-03-19T00:02:50Z"}, {"author": "Himanshu Shah", "text": "

As soon as the notification arrives, the switchover happens. The whole goal is to reduce the service outage instead of waiting for a BGP withdraw.

", "time": "2024-03-19T00:03:22Z"}, {"author": "Himanshu Shah", "text": "

The switchover scheme is not yet proposed.

", "time": "2024-03-19T00:03:38Z"}, {"author": "Antoine Fressancourt", "text": "

@David My question is about the selection of the remote repair node. Is it a self-election mechanism triggered by receiving a failure notification? If a node tries to repair a path, does it stop the upstream relay of the failure notification? Can two remote nodes be repairing a path in parallel?

", "time": "2024-03-19T00:06:45Z"}, {"author": "Himanshu Shah", "text": "

Sorry, I meant \"notification scheme\" is not yet proposed.

", "time": "2024-03-19T00:07:18Z"}, {"author": "John Scudder", "text": "

Did the person at the mic really say this solution would provide microsecond scale repair?

", "time": "2024-03-19T00:07:43Z"}, {"author": "John Scudder", "text": "

Ain\u2019t no way.

", "time": "2024-03-19T00:07:50Z"}, {"author": "Himanshu Shah", "text": "

I agree - has to be milliseconds depending on what the notification scheme is..

", "time": "2024-03-19T00:08:28Z"}, {"author": "Weiqiang Cheng", "text": "

https://datatracker.ietf.org/doc/draft-cheng-rtgwg-ai-network-reliability-problem/

", "time": "2024-03-19T00:13:09Z"}, {"author": "Weiqiang Cheng", "text": "

This draft gives some analysis on requirements of fast protection in AI DC

", "time": "2024-03-19T00:13:39Z"}, {"author": "Yingzhen Qu", "text": "

@meetecho, please switch the camera to the presenter

", "time": "2024-03-19T00:15:33Z"}, {"author": "Lorenzo Miniero", "text": "

Done!

", "time": "2024-03-19T00:15:47Z"}, {"author": "Yingzhen Qu", "text": "

thank you

", "time": "2024-03-19T00:15:59Z"}, {"author": "Antoine Fressancourt", "text": "

@Weiqiang thanks for the link

", "time": "2024-03-19T00:17:37Z"}, {"author": "David Lamparter", "text": "

the queue is still stuck with people from the previous draft, can we clear that?

", "time": "2024-03-19T00:17:51Z"}, {"author": "Weiqiang Cheng", "text": "

@John, I mentioned that the requirement is sub-millisecond, even microseconds. I don't think the solution will provide microsecond scale. But it is valuable to look for ways to improve the recovery time.

", "time": "2024-03-19T00:19:09Z"}, {"author": "Himanshu Shah", "text": "

@weiqiang - the proposal reminds me of AIS/RDI type of scheme.. :-)

", "time": "2024-03-19T00:20:28Z"}, {"author": "Adrian Farrel", "text": "

Looks like a pretty picture to me

", "time": "2024-03-19T00:23:35Z"}, {"author": "Dave Phelan", "text": "

Wasn\u2019t drawn on a napkin.

", "time": "2024-03-19T00:24:24Z"}, {"author": "Christopher Hawker", "text": "

Another pretty picture!

", "time": "2024-03-19T00:25:47Z"}, {"author": "Jeff Tantsura", "text": "

@Jeff Haas - on the fast recovery topic - you mentioned other drafts that are being progressed in other WGs - please share the draft names

", "time": "2024-03-19T00:31:50Z"}, {"author": "Acee Lindem", "text": "

Is it just me or doesn't the Meetecho private chat work?

", "time": "2024-03-19T00:36:47Z"}, {"author": "Christian Hopps", "text": "

is the presentation over? the questioners seem to be assuming that

", "time": "2024-03-19T00:37:08Z"}, {"author": "Christian Hopps", "text": "

can we finish the presentation first?

", "time": "2024-03-19T00:37:27Z"}, {"author": "Shukri Abdallah", "text": "

Do you specify an algorithm for selecting orbits that form a stripe?

", "time": "2024-03-19T00:37:58Z"}, {"author": "Christopher Hawker", "text": "

@Acee just tried sending you a private chat.

", "time": "2024-03-19T00:38:29Z"}, {"author": "Lorenzo Miniero", "text": "

Acee: what do you mean by doesn't work? I use it regularly when I need to get in touch with people in rooms, e.g., to provide assistance to remote speakers

", "time": "2024-03-19T00:38:47Z"}, {"author": "Lorenzo Miniero", "text": "

You can try contacting me privately here too, if you want to test

", "time": "2024-03-19T00:39:42Z"}, {"author": "Adrian Farrel", "text": "

@lorenzo, it is not immediately obvious how to do it from the chat window without starting up zulip

", "time": "2024-03-19T00:40:19Z"}, {"author": "Lorenzo Miniero", "text": "

Adrian: you need to click on the name of the participant from the participants list, and options will appear. The balloon icon will open a private chat

", "time": "2024-03-19T00:41:01Z"}, {"author": "Adrian Farrel", "text": "

Yup, but not the participant's name in the chat window :-)

", "time": "2024-03-19T00:41:24Z"}, {"author": "Lorenzo Miniero", "text": "

Ah no, that's correct: those names are not clickable

", "time": "2024-03-19T00:42:05Z"}, {"author": "Dawei Fan", "text": "

Off-stripe forwarding seems to find a short path. My question is about propagation in this architecture: is it the same as in a current IGP (IS-IS)?

", "time": "2024-03-19T00:42:58Z"}, {"author": "Yisong Liu", "text": "

@John, we hope to provide a solution for millisecond repair. but it may need more work on the specific solution and we'll continue to do that

", "time": "2024-03-19T00:43:17Z"}, {"author": "Andrew Stone", "text": "

Fair to say TE goals are mainly for user stations, so the TE path from gateway to satellite is what's critical and TE from satellite to gateway is not?

", "time": "2024-03-19T00:50:19Z"}, {"author": "Yisong Liu", "text": "

@Jeff Haas, please confirm whether you were referring to this draft: https://datatracker.ietf.org/doc/draft-wang-idr-next-next-hop-nodes/

", "time": "2024-03-19T00:50:37Z"}, {"author": "Andrew Stone", "text": "

** for user downstream traffic

", "time": "2024-03-19T00:50:41Z"}, {"author": "Tony Przygienda", "text": "

well, from experience, satellite guys do -really- like their proprietary stuff ;-)

", "time": "2024-03-19T00:56:25Z"}, {"author": "Adrian Farrel", "text": "

Quote Jeff: The satellite work is operating in a vacuum

", "time": "2024-03-19T00:57:25Z"}, {"author": "Christian Hopps", "text": "

that's fantastic

", "time": "2024-03-19T00:58:18Z"}, {"author": "Tom Hill", "text": "

Tony, I had some queries about approach, but I'll pick it up in a coffee break. More musing on past efforts.

", "time": "2024-03-19T00:59:10Z"}, {"author": "Tony Przygienda", "text": "

well, security through obscurity is a concept. Plus, interop is only interesting if you look to shop vendors. Most router vendors are not particularly good satellite builders and though building a router is not easy, putting a satellite into orbit is of a different scale ...

", "time": "2024-03-19T00:59:20Z"}, {"author": "Tony Przygienda", "text": "

I doubt culturally the term \"router scientist\" will ever achieve the same level of awe when mentioned as \"rocket scientist\" ;-)

", "time": "2024-03-19T01:00:45Z"}, {"author": "Christian Hopps", "text": "

Why should this be in the network? I just don't get it. What servers a client uses should not be part of the routing database.

", "time": "2024-03-19T01:04:45Z"}, {"author": "Andrew Alston", "text": "

This is also outside of charter - APN is a large topic - and for RTGWG to take on any larger topic - it has to be explicitly chartered to do so - as per the current charter

", "time": "2024-03-19T01:05:20Z"}, {"author": "Dmitry Afanasiev", "text": "

Also, there are probably going to be only a few LEO constellations operating at any given time, very likely interconnected only via ground gateways - so there is little incentive to bother with standardization

", "time": "2024-03-19T01:05:23Z"}, {"author": "Tony Przygienda", "text": "

@Dima, well, with the amount of space junk Elon and AMZN are shooting up there and grid arrays I'm not sure that is true anymore ...

", "time": "2024-03-19T01:06:19Z"}, {"author": "Changwang Lin", "text": "

@Antoine,

There are two types of notification mechanisms:

  1. Proactive notification. The suggestion is to only protect the two-level network and only notify upwards once.
  2. Flow-triggered notification. Send notifications in the direction of the flow when it is perceived that it cannot be forwarded. This method can notify upstream. If there is no protection path upstream, subsequent traffic will trigger notification to the higher-level device again.

Remote nodes can simultaneously repair a remote path fault.

The remote path aware document needs to address several issues:

  1. In a specific topology, convergence does not depend on the control plane protocol.
  2. Control plane protocols, such as BGP, extend support to add remote path information on the next hop of the route.
  3. Fault perception: perceived by the remote end, and then notified to other protocols to quickly notify the remote end.
  4. Switching process: It does not rely on the control plane and completes fast switching on the forwarding plane, achieving microsecond-level convergence.

", "time": "2024-03-19T01:06:51Z"}, {"author": "Christian Hopps", "text": "

Write a protocol for applications to choose servers don't try and wedge this into routing.

", "time": "2024-03-19T01:07:08Z"}, {"author": "Andrew Alston", "text": "

+1 Christian

", "time": "2024-03-19T01:07:28Z"}, {"author": "Dmitry Afanasiev", "text": "

but the problem is certainly interesting and it seems it can be solved with available tools + a reasonable amount of tweaking and without too big sacrifices

", "time": "2024-03-19T01:07:33Z"}, {"author": "David Lamparter", "text": "

did the slides just disappear or is it just me?

", "time": "2024-03-19T01:07:55Z"}, {"author": "Tony Przygienda", "text": "

is there a preso after that? Otherwise it's 2AM+ here and bed would be nice instead of suffering through this stuff ...

", "time": "2024-03-19T01:07:56Z"}, {"author": "David Lamparter", "text": "

(nevermind, back now)

", "time": "2024-03-19T01:08:17Z"}, {"author": "Christian Hopps", "text": "

Tony this stuff is progressing b/c lots of people dislike it so much they are ignoring it.

", "time": "2024-03-19T01:08:30Z"}, {"author": "David Lamparter", "text": "

we are at \u2192

10:30 Extension of Application-aware Networking (APN) Framework for Application Side (it's 11:10 now, so >30min over)

10:40 Application-aware Data Center Network (APDN) Use Cases and Requirements

10:50 Use Cases and Requirements for Implementing Lossless Techniques in Wide Area Networks

11:00 Use Cases-Standalone Service ID in Routing Network

", "time": "2024-03-19T01:09:34Z"}, {"author": "Tony Przygienda", "text": "

well, RFCs are pretty clear what you do with stuff after 2 failed BOFs. But of course some grownups have to apply the procedural framework ...

", "time": "2024-03-19T01:09:38Z"}, {"author": "Andrew Alston", "text": "

Well - it's failed 2 BOFs if I recall - and the IESG wouldn't approve their proposed charter - and now it's being shopped - but as I said - it's explicitly out of charter

", "time": "2024-03-19T01:09:46Z"}, {"author": "Dmitry Afanasiev", "text": "

@Tony - number of sats is big, no doubt, and it is growing fast, but as for systems - it's just 2, maybe 3-4 more of comparable scale will come up later, but that's it

", "time": "2024-03-19T01:10:21Z"}, {"author": "Jeff Tantsura", "text": "

I'm here

", "time": "2024-03-19T01:10:37Z"}, {"author": "Tony Przygienda", "text": "

@David: ok, looks like I get some snooze time back and couple hours sleep before morning meetings get me up again ...

", "time": "2024-03-19T01:10:48Z"}, {"author": "David Lamparter", "text": "

:)

", "time": "2024-03-19T01:11:32Z"}, {"author": "Tony Przygienda", "text": "

@Jeff, yepp, probably but I doubt you'll ever grow up (and we love you for it ;-)

", "time": "2024-03-19T01:11:57Z"}, {"author": "David Lamparter", "text": "

@Jeff: mic queue is still locked from previous draft btw

", "time": "2024-03-19T01:12:13Z"}, {"author": "Yingzhen Qu", "text": "

@David, thanks for the reminder

", "time": "2024-03-19T01:12:31Z"}, {"author": "Jeff Tantsura", "text": "

;-)

", "time": "2024-03-19T01:14:33Z"}, {"author": "Tony Przygienda", "text": "

well, I disagree with the premise that network substrate needs to understand the application semantics ...

", "time": "2024-03-19T01:19:48Z"}, {"author": "Andrew Alston", "text": "

You are not alone in that Tony

", "time": "2024-03-19T01:20:12Z"}, {"author": "Dmitry Afanasiev", "text": "

+1

", "time": "2024-03-19T01:20:19Z"}, {"author": "Tony Przygienda", "text": "

as @Dima once said: it's all distributed linear algebra at the end ;-) and this does not know whether you're computing tensor cross-section to calculate static stability or train some generative parrot

", "time": "2024-03-19T01:21:06Z"}, {"author": "Dmitry Afanasiev", "text": "

but collective ops is a special beast, HPC interconnects historically provided support for it, at least some of them

", "time": "2024-03-19T01:21:15Z"}, {"author": "David Lamparter", "text": "

anyone know how this relates to coinrg?

", "time": "2024-03-19T01:21:33Z"}, {"author": "Tony Przygienda", "text": "

nothing against in-substrate support for folding of course ...

", "time": "2024-03-19T01:21:43Z"}, {"author": "Dmitry Afanasiev", "text": "

@David collective operations - e.g. all-reduce, used in ML training. Doing intermediate reduction in the network can improve performance. Definitely computation in the network.

", "time": "2024-03-19T01:24:06Z"}, {"author": "Jeff Tantsura", "text": "

SHARP is doing this today

", "time": "2024-03-19T01:26:00Z"}, {"author": "Dmitry Afanasiev", "text": "

@Jeff exactly, it's a very good example

", "time": "2024-03-19T01:26:25Z"}, {"author": "Tony Przygienda", "text": "

thinking that to the end you basically want S-I-PMSI capable on the substrate setting up such hierarchical folding \"trees\" and that's one of the things I tried to talk to folks about as BIER use case (think generalized sharp distribution substrate ;-)

", "time": "2024-03-19T01:26:57Z"}, {"author": "Dmitry Afanasiev", "text": "

but no SHARP for Eth/IP .. at least yet :)

", "time": "2024-03-19T01:27:03Z"}, {"author": "Tony Przygienda", "text": "

unfortunately, the folks dealing with that are religiously against any type of multicast (for which I have some understanding)

", "time": "2024-03-19T01:27:27Z"}, {"author": "Kehan Yao", "text": "

@Dmitry, agree. Collective operations offloaded to the switch are a common solution for AI/HPC. So for AI networking, maybe people shouldn't wear glasses, and some in-network behavior may be helpful.

", "time": "2024-03-19T01:27:57Z"}, {"author": "John Scudder", "text": "

I\u2019m not able to stay for this talk on \u201clossless techniques\u201d but I want to point out to anyone who isn\u2019t aware that it sure sounds the same as what detnet works on.

", "time": "2024-03-19T01:27:59Z"}, {"author": "Tony Przygienda", "text": "

@Dima: RIFT is a natural for it once we manage to get the multicast folks to finish their work ;-) though of course the multicast here becomes not multicast but really a distributed folding operation

", "time": "2024-03-19T01:28:12Z"}, {"author": "Yingzhen Qu", "text": "

@John, noted.

", "time": "2024-03-19T01:28:51Z"}, {"author": "Jeff Tantsura", "text": "

multicast (or tree building at all) is the least of your problems, ASIC doing reduction at line rate is though

", "time": "2024-03-19T01:28:51Z"}, {"author": "Tony Przygienda", "text": "

ASICs are easy, they build a new one every 6 months ;-)

", "time": "2024-03-19T01:29:08Z"}, {"author": "Dmitry Afanasiev", "text": "

@Tony yes, establish reduction topology - this is straightforward, but also deal with data loss, reducer failures, latency bounds, maybe agreement on quantization

", "time": "2024-03-19T01:29:16Z"}, {"author": "Tony Przygienda", "text": "

@Dima, you know me, I'm a control plane whinie, the larger the scale the better ;-)

", "time": "2024-03-19T01:29:45Z"}, {"author": "Tony Przygienda", "text": "

but of course I agree, practically getting such distributed folding to deliver real gains is anything but easy at the volumes

", "time": "2024-03-19T01:30:15Z"}, {"author": "Dmitry Afanasiev", "text": "

vectors to be reduced can be quite large, so you need enough buffer space on the reducing switches

", "time": "2024-03-19T01:30:18Z"}, {"author": "Tony Przygienda", "text": "

oh, people complained about private messaging not working because I just discovered a long queue of little symbols at the top I ignored ;-) Cute ...

", "time": "2024-03-19T01:32:20Z"}, {"author": "Dmitry Afanasiev", "text": "

@Tony I'm with you wrt scale - the larger the better, also that's where really interesting problems start to appear, and just throwing money is not enough to make those problems go away :)

", "time": "2024-03-19T01:32:41Z"}, {"author": "Tony Przygienda", "text": "

@Dima, nah, money helps. You just start to throw it at smart people rather than brute forcing it ;-)

", "time": "2024-03-19T01:33:30Z"}, {"author": "Tony Przygienda", "text": "

okay, bed now. fun session ;-)

", "time": "2024-03-19T01:33:39Z"}]