Adaptive Routing Notification
draft-wh-rtgwg-adaptive-routing-arn-05

Document Type Expired Internet-Draft (individual)
Expired & archived
Authors Haibo Wang, Xuesong Geng, Xiaohu Xu, Yinben Xia, Jie Dong, Hongyi Huang
Last updated 2026-06-11 (Latest revision 2025-12-08)
RFC stream (None)
Intended RFC status (None)
Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG state Expired
Telechat date (None)
Responsible AD (None)
Send notices to (None)

This Internet-Draft is no longer active.

Abstract

Large-scale supercomputing and AI data centers use multipath to balance load and/or improve transport reliability. Adaptive routing (AR), widely used in direct topologies such as dragonfly, is becoming popular in commodity data centers as a way to adjust routing policies dynamically in response to path congestion and failures. When congestion or a failure occurs, the sensing node can apply AR locally and can also send the congestion/failure information to other nodes in a timely and accurate manner, so that those nodes apply AR as well and avoid exacerbating congestion on the reported path. This document specifies Adaptive Routing Notification (ARN), a general mechanism that proactively disseminates congestion detection and congestion elimination information so that remote nodes can apply re-routing policies. Such mechanisms are particularly important for AI workloads such as DeepSeek's MoE models, which exhibit dynamic all-to-all communication patterns with bursty traffic, because they let the network respond immediately to transient congestion.
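
The notify-then-reroute behavior described in the abstract can be illustrated with a toy model. The following is a minimal sketch only, not the draft's message format or procedures: the names `ArnEvent`, `ArnMessage`, `Node`, and `notify`, the message fields, and the modulo-based path selection are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ArnEvent(Enum):
    # Two notification kinds named in the abstract: detection and elimination.
    CONGESTION_DETECTED = "detected"
    CONGESTION_CLEARED = "cleared"


@dataclass(frozen=True)
class ArnMessage:
    # Hypothetical fields; the draft's actual encoding is not reproduced here.
    reporter: str
    path: str
    event: ArnEvent


@dataclass
class Node:
    name: str
    paths: list[str]
    avoided: set[str] = field(default_factory=set)

    def handle(self, msg: ArnMessage) -> None:
        """Update local state from a notification (assumed behavior)."""
        if msg.event is ArnEvent.CONGESTION_DETECTED:
            self.avoided.add(msg.path)
        else:
            self.avoided.discard(msg.path)

    def pick_path(self, flow_id: int) -> str:
        """Choose among non-avoided paths; fall back to all paths if none remain."""
        usable = [p for p in self.paths if p not in self.avoided] or self.paths
        return usable[flow_id % len(usable)]


def notify(sensor: Node, peers: list[Node], path: str, event: ArnEvent) -> None:
    """The sensing node applies AR locally, then informs remote nodes."""
    msg = ArnMessage(reporter=sensor.name, path=path, event=event)
    sensor.handle(msg)
    for peer in peers:
        peer.handle(msg)


leaf1 = Node("leaf1", ["spine1", "spine2"])
leaf2 = Node("leaf2", ["spine1", "spine2"])

notify(leaf1, [leaf2], "spine1", ArnEvent.CONGESTION_DETECTED)
path_during_congestion = leaf2.pick_path(0)   # "spine2"

notify(leaf1, [leaf2], "spine1", ArnEvent.CONGESTION_CLEARED)
path_after_clear = leaf2.pick_path(0)         # "spine1"
```

In this sketch the remote node steers flows away from the reported path until the elimination notification arrives, which is the effect the abstract attributes to ARN. How notifications are actually encoded, scoped, and delivered is left to the draft itself.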

Authors

Haibo Wang
Xuesong Geng
Xiaohu Xu
Yinben Xia
Jie Dong
Hongyi Huang

(Note: The e-mail addresses provided for the authors of this Internet-Draft may no longer be valid.)