Skip to main content

Routing mechanism in Dragonfly Networks Gap Analysis, Problem Statement, and Requirements
draft-wang-rtgwg-dragonfly-routing-problem-01

Document Type Active Internet-Draft (individual)
Authors Ruixue Wang , Changwang Lin , wangwenxuan , Weiqiang Cheng
Last updated 2024-03-03
RFC stream (None)
Intended RFC status (None)
Formats
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-wang-rtgwg-dragonfly-routing-problem-01
RTGWG Working Group                                             R. Wang
Internet Draft                                             China Mobile
Intended status: Informational                                   C. Lin
Expires: September 3,2024                          New H3C Technologies
                                                                W. Wang
                                                           China Mobile
                                                               W. Cheng
                                                                                                     China Mobile
                                                          March 3, 2024

       Routing mechanism in Dragonfly Networks Gap Analysis, Problem
                        Statement, and Requirements
              draft-wang-rtgwg-dragonfly-routing-problem-01

Abstract

   This document provides the gap analysis of existing routing
   mechanism in dragonfly networks, describes the fundamental problems,
   and defines the requirements for technical improvements.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 3 2024.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this

wang, et al.          Expire September 3, 2024                  [Page 1]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction...................................................3
      1.1. Requirements Language.....................................3
      1.2. Terminology...............................................3
   2. Existing Mechanisms............................................4
      2.1. Basic Topology............................................4
      2.2. Routing mechanisms in Dragonfly network...................5
   3. Gap Analysis...................................................6
      3.1. Load In balance...........................................6
      3.2. Adaptive Routing Notifications............................6
   4. Problem Statement..............................................8
   5. Requirements for Dragonfly network Mechanisms..................8
   6. Security Considerations........................................9
   7. IANA Considerations............................................9
   8. References....................................................10
      8.1. Normative References.....................................10
      8.2. Informative References...................................10
   Authors' Addresses...............................................11

cheng, et al.        Expires September 3, 2024                [Page 2]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

1. Introduction

   Dragonfly network is a type of high-performance computer
   interconnection network architecture that is commonly used in large-
   scale computing environments. It consists of a collection of
   interconnected groups, with each group containing several computing
   resources such as processors, storage devices, and nodes. The nodes
   within each group communicate with each other using a high-speed
   local network, while the groups themselves are connected through a
   global network. Dragonfly networks are designed to provide high
   bandwidth and low latency communication capabilities, making them
   ideal for applications that require large-scale data processing and
   intensive computing tasks. Overall, dragonfly networks offer a
   scalable, efficient, and flexible solution for connecting hundreds
   or even thousands of computing resources in a parallel computing
   environment.

1.1. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2. Terminology

   Group: In a group, multiple nodes are organized into a physical
   topology structure and interconnected by a high-speed network.

   Inter-group link: Link connecting different groups.

   Routing: The path or strategy that data packets take to transmit
   through the network.

   Topology: The physical and logical layout structure of the network.
   Dragonfly network is a type of topology.

   Routing algorithm: The algorithm that determines the path or
   strategy for data packets to transmit through the network.

   Congestion control: When there is too much traffic in the network,
   adjusting the transmission rate and routing method, etc., to avoid
   network congestion.

   MR : Minimal Routing

   NMR Non-Minimal Routing

wang, et al.        Expires September 3, 2024                [Page 3]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

   AR: Adaptive Routing

   VLB: Valiant Load-Balanced Routing

2. Existing Mechanisms

2.1. Basic Topology

       N N N N N N    N N N N N N     N N N N N N
       | | | | | |    | | | | | |     | | | | | |
      ++-+-+-+-+-++  ++-+-+-+-+-++   ++-+-+-+-+-++
      |     G1    |  |     G2    |...|    G8     |
      +-+---+----++  ++----+----++   ++---+----+-+
        |   |    |    |    |    |     |   |    |
        |   |    +----+    |    +-----+   |    |
        |   +--------------)--------------+    |
      +-+------------------+-------------------+-+
      |   +------------------------------+       |
      |   |                              |   G0  |
      | +-+-+          +---+           +-+-+     |
      | |R0 +----------+R1 +-----------+ R2|     |
      | ++-++          ++-++           ++-++     |
      |  | |            | |             | |      |
      +--)-)------------)-)-------------)-)------+
         | |            | |             | |
         N N            N N             N N
                  Figure 1: DragonFly network diagram

   In the DragonFly network shown in Figure 1, there are a total of 9
   groups, with each group consisting of 3 routers (G). Each router is
   connected to 2 nodes (N). The groups in the DragonFly network are
   connected through inter-group links. The routers within each group,
   as well as between routers and nodes, are connected through high-
   speed links within the group.

   For data communication within a group, it is typically sufficient to
   forward traffic only through the links within the group. For data
   communication between groups, traffic needs to be forwarded through
   both the links within each group and the inter-group links. The
   specific path selection in the Dragonfly network is typically
   determined by the routing protocol used in the network. The routing
   protocol is responsible for dynamically determining the best path
   for data packets to travel from the source to the destination.

   Various topologies can be used to form the intra-group
   connectivity.A typical intra-group topology is a fully connected

wang, et al.        Expires September 3, 2024                [Page 4]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

   graphwhere all switches are directly connected to each other. An
   exampleof such an intra-group topology is shown in the G0 group in
   Figure 1. The intra-group connectivity in the Cascade architectureis
   a 2-dimensional all-to-all mesh.

2.2. Routing mechanisms in Dragonfly network

   This section briefly introduces the existing routing mechanisms in
   dragonfly networks. Dragonfly networks use several routing
   mechanisms, each with its own advantages and disadvantages. Here are
   some brief overviews of several common routing mechanisms:

   o Minimal Routing is the simplest and most commonly used routing
      mechanism in Dragonfly networks. It uses the path with the least
      number of channels to quickly deliver data to the destination
      node. The advantage of MR is that it is easy to implement and has
      low latency. However, since MR only focuses on the path with the
      least number of channels, the risk of load imbalance is
      relatively high.

   o Non-Minimal Routing is a routing mechanism that avoids load
      imbalance by choosing a path other than the one with the least
      number of channels. The advantage of NMR is that the routing
      algorithm is intelligent and flexible, able to balance the load
      of network communication and reduce latency. However, NMR is more
      complex, requiring more computational resources and communication
      overhead.

   o Adaptive Routing is a mechanism that can dynamically adjust the
      routing path by intelligently judging the network congestion
      status. AR's strengths lie in its adaptability, which can control
      traffic in high-load situations and prevent congestion. The
      disadvantage is that its implementation is complex and requires
      more sophisticated algorithms and computational resources.

   o Valiant Load-Balanced Routing uses the classic Valiant algorithm
      to select paths between global routing networks and then uses
      load-balancing routing algorithms between each group. The
      advantage of VLB is that it can achieve load balancing across the
      network range. The disadvantage is that it is complex, requiring
      more resources and computational costs.

   Overall, the choice of routing mechanism in dragonfly networks
   requires a balance between performance, cost, and other factors and
   depends on specific application scenarios and requirements.

wang, et al.        Expires September 3, 2024                [Page 5]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

3. Gap Analysis

3.1. Load In balance

   When the Dragonfly network routes through the minimum-route
   mechanism, the problem of load imbalance is easy to occur because
   the routing path is fixed and the communication volume between
   different groups in Dragonfly network may not be the same. When load
   imbalance occurs, the group with larger communication volume may be
   overly congested, affecting the overall performance of the network.
   We need a load balancing mechanism that can distribute the load
   between optimal and non-optimal links to avoid congestion on the
   main link.

   There are several load balancing mechanisms that can achieve this
   goal. One common approach is to use a combination of Equal-Cost
   Multi-Path routing and Link Aggregation. ECMP distributes the
   traffic across multiple paths based on their cost, while Link
   Aggregation combines multiple physical links into a single logical
   link to increase bandwidth and provide redundancy.

   However, non-minimum-route mechanism is required to calculate the
   distance of all possible paths, which requires more communication
   and computational resources, and cannot completely avoid the problem
   of load imbalance.

   Adaptive routing mechanism can dynamically adjust the routing path
   according to the network congestion situation, making the network
   more adaptable to different traffic. However, the computational cost
   of this mechanism is high, and it occupies some of the bandwidth of
   the network, which may affect the performance of applications.

   Valiant load-balancing routing algorithm can achieve load balancing
   across the entire network, but its design and implementation are
   complex and require more computing and communication resources.
   Although it can improve routing reliability and fault tolerance, it
   may not be necessary to adopt this mechanism in small-scale
   networks.

   Due to the random network topology used in the Dragonfly network,
   the distance between each node internally is random, which may cause
   some unnecessary redundancies in routing and affect routing
   efficiency.

3.2. Adaptive Routing Notifications

   The dynamic adjustment of flow paths based on the load situation is
   a traffic scheduling algorithm that can dynamically choose the

wang, et al.        Expires September 3, 2024                [Page 6]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

   optimal flow path based on the load of nodes (or links) in the
   network to transmit data packets.

   This algorithm usually uses two techniques: one is based on traffic
   measurement, and the other is based on protocol exchange between
   routers or switches. The traffic measurement-based technology
   measures the traffic in the network using network analysis tools or
   dedicated hardware embedded in routers or switches. Once some nodes
   or links with high loads are detected, the flow path can be
   automatically adjusted to alleviate the load. On the other hand, the
   protocol exchange-based technology relies on protocol communication
   between routers or switches to obtain the current network topology
   and node load data, and flow paths can be adjusted accordingly based
   on this information.

   In this way, network administrators can ensure that there is always
   the best data flow path at any time, thereby maximizing network
   performance, reducing latency, and avoiding network congestion. At
   the same time, dynamic adjustment of flow paths can also provide
   robustness to the network, enabling it to automatically adapt to
   adverse events such as changes in network topology and node
   failures.

   Whether based on traffic testing or protocol exchange between
   routers or switches, devices need to be able to communicate the
   current network performance in a quantitative manner.

   Traffic testing technology requires the use of network analysis
   tools or dedicated hardware embedded in routers or switches to
   measure traffic in the network and obtain information about node
   load. These node load data needs to be translated into digital data
   and sent to the control plane through protocols or interfaces such
   as SNMP (Simple Network Management Protocol), Netflow, etc.

   Protocol exchange technology uses protocol communication between
   routers or switches, such as OpenFlow, IS-IS, etc., to obtain the
   current network topology and node load information through the
   control plane. These information are often encoded into digital
   formats and transmitted to the operation plane through network
   transmission protocols.

   Adaptive routing notifications are a communication protocol used to
   relay routing information and network load in a network. These
   notifications can be messages between nodes or between switches and
   routers.

   In the Dragonfly network, adaptive routing notifications are
   utilized to implement adaptive routing mechanisms. When the network

wang, et al.        Expires September 3, 2024                [Page 7]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

   load reaches a certain level, nodes and switches use notifications
   to dynamically choose routing paths. For example, during network
   congestion, switches and routers send notifications to prompt nodes
   to redirect traffic to different ports or nodes. These notifications
   can also include other information about network congestion and load
   balancing, such as bandwidth usage, device load and performance, and
   traffic rates.

   The benefits of using adaptive routing notifications in the
   Dragonfly network are that they enable real-time adjustments of
   routing paths for nodes and switches, avoiding congestion and
   improving network performance. Additionally, adaptive routing
   notifications help network administrators identify and resolve
   network issues more easily, such as pinpointing congestion points
   and routing bottlenecks.

   In summary, adaptive routing notifications play a significant role
   in the Dragonfly network and are a crucial component in implementing
   adaptive routing mechanisms.

   Regardless of the approach, communication between devices needs to
   be standardized and routinized to achieve self-adaptation and
   interoperability across devices. Standardized and routinized
   communication between devices is critical to building adaptive
   networks.

4. Problem Statement

   The current problem with the Dragonfly network is the lack of a
   concise and effective routing protocol for load balancing between
   optimal and non-optimal links.

   Another problem is that for dynamic load balancing, it is necessary
   to standardize how network performance is quantified and
   communicated in a quantitative manner. This requires
   standardization.

5. Requirements for Dragonfly network Mechanisms

   In the Dragonfly architecture, the routing protocol is a crucial
   component that guides packet transmission and route selection. Here
   are several aspects that the routing protocol in the Dragonfly
   architecture requires:

   *  Low latency: Low latency is essential in the Dragonfly
     architecture. Therefore, the routing protocol must be fast and

wang, et al.        Expires September 3, 2024                [Page 8]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

     efficient to ensure that packets are transmitted to the
     destination node promptly.

   *  Load balancing: Load balancing is important in the Dragonfly
     architecture, and the routing protocol needs to support multiple
     available paths for load balancing. The routing protocol should
     dynamically select among multiple available paths to ensure fast
     packet transmission and distribute the load across network
     connections.

   *  Scalability: The Dragonfly architecture is typically deployed at
     large scale with a large number of nodes communicating with each
     other. Hence, the routing protocol needs to be scalable and
     capable of supporting route selection and packet transmission
     among a large number of nodes.

   *  Adaptability: The network topology in the Dragonfly architecture
     can change over time. The routing protocol needs to be adaptive
     and capable of re-computing optimal paths when the network
     topology changes, ensuring the selection of the best path for
     packet transmission.

   *  Reliability: The routing protocol in the Dragonfly architecture
     needs to ensure packet reliability. It should support link failure
     detection and recovery to ensure that packets can be correctly
     transmitted to the destination node in the event of link failures.

   In summary, the routing protocol is a critical component in the
   Dragonfly architecture, requiring support for low latency, load
   balancing, scalability, adaptability, and reliability. Only with
   these requirements fulfilled can the routing protocol reliably
   operate in the Dragonfly architecture and provide efficient support
   for network communication.

6. Security Considerations

   TBD.

7. IANA Considerations

This document does not request any IANA allocations.

wang, et al.        Expires September 3, 2024                [Page 9]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

8. References

8.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate

             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC

              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,

              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

8.2. Informative References

   TBD

wang, et al.        Expires September 3, 2024               [Page 10]
Internet-Draft    Dragonfly Routing Problem Statement       March 2024

Authors' Addresses

   Ruixue Wang
   China Mobile
   China

   Email: wangruixue@chinamobile.com

   Changwang Lin
   New H3C Technologies
   China

   Email: linchangwang.04414@h3c.com

   Wenxuan Wang
   China Mobile
   China

   Email: wangwenxuan@chinamobile.com
   
   
   Weiqiang Cheng
   China Mobile
   China

   Email: chengweiqiang@chinamobile.com

wang, et al.        Expires September 3, 2024               [Page 11]