Fully Adaptive Routing Ethernet using BGP
draft-xu-idr-fare-02
| Document | Type |
This is an older version of an Internet-Draft whose latest revision state is "Active".
Expired & archived
|
|
|---|---|---|---|
| Authors | Xiaohu Xu , Shraddha Hegde , Zongying He , Junjie Wang , Hongyi Huang , Qingliang Zhang , Hang Wu , Yadong Liu , Yinben Xia , Peilong Wang , Tiezheng | ||
| Last updated | 2025-03-17 (Latest revision 2024-09-01) | ||
| RFC stream | (None) | ||
| Formats | |||
| Stream | Stream state | (No stream defined) | |
| Consensus boilerplate | Unknown | ||
| RFC Editor Note | (None) | ||
| IESG | IESG state | Expired | |
| Telechat date | (None) | ||
| Responsible AD | (None) | ||
| Send notices to | (None) |
This Internet-Draft is no longer active. A copy of the expired Internet-Draft is available in these formats:
Abstract
Large language models (LLMs) like ChatGPT have become increasingly popular in recent years due to their impressive performance in various natural language processing tasks. These models are built by training deep neural networks on massive amounts of text data, often consisting of billions or even trillions of parameters. However, the training process for these models can be extremely resource- intensive, requiring the deployment of thousands or even tens of thousands of GPUs in a single AI training cluster. Therefore, three- stage or even five-stage CLOS networks are commonly adopted for AI networks. The non-blocking nature of the network become increasingly critical for large-scale AI models. Therefore, adaptive routing is necessary to dynamically distribute the traffic to the same destination over multiple equal-cost paths, based on the network capacity and even congestion information along those paths.
Authors
Xiaohu Xu
Shraddha Hegde
Zongying He
Junjie Wang
Hongyi Huang
Qingliang Zhang
Hang Wu
Yadong Liu
Yinben Xia
Peilong Wang
Tiezheng
(Note: The e-mail addresses provided for the authors of this Internet-Draft may no longer be valid.)