An Open Congestion Control Architecture with network cooperation for RDMA Fabric
draft-zhh-tsvwg-open-architecture-00

Document Type Active Internet-Draft (individual)
Last updated 2019-07-05
TSVWG                                                          Y. Zhuang
Internet-Draft                                                  R. Huang
Intended status: Informational             Huawei Technologies Co., Ltd.
Expires: January 5, 2020                                    July 4, 2019

  An Open Congestion Control Architecture with network cooperation for
                              RDMA Fabric
                  draft-zhh-tsvwg-open-architecture-00

Abstract

   This document describes an open congestion control architecture with
   network cooperation (including network proactive and passive control)
   for a high-performance RDMA fabric, providing low latency and high
   throughput for datacenter applications such as AI computing.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 5, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Zhuang & Huang           Expires January 5, 2020                [Page 1]
Internet-Draft              Open architecture                  July 2019

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Conventions . . . . . . . . . . . . . . . . . . . . . . . . .   3
   3.  Abbreviations . . . . . . . . . . . . . . . . . . . . . . . .   3
   4.  Design Principle for high performance RDMA fabric . . . . . .   3
   5.  Architecture Overview . . . . . . . . . . . . . . . . . . . .   4
     5.1.  Roles and Functionalities . . . . . . . . . . . . . . . .   6
       5.1.1.  Sender NIC  . . . . . . . . . . . . . . . . . . . . .   6
       5.1.2.  Switch  . . . . . . . . . . . . . . . . . . . . . . .   6
       5.1.3.  Receiver NIC  . . . . . . . . . . . . . . . . . . . .   6
     5.2.  Interfaces  . . . . . . . . . . . . . . . . . . . . . . .   7
       5.2.1.  NIC interfaces  . . . . . . . . . . . . . . . . . . .   8
       5.2.2.  Network interface . . . . . . . . . . . . . . . . . .   8
   6.  Compatibility Consideration . . . . . . . . . . . . . . . . .   9
     6.1.  Negotiate the congestion control capability . . . . . . .   9
     6.2.  Co-exist with current NIC to NIC control channel  . . . .   9
   7.  Security Considerations . . . . . . . . . . . . . . . . . . .   9
   8.  Manageability Consideration . . . . . . . . . . . . . . . . .  10
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  10
   10. References  . . . . . . . . . . . . . . . . . . . . . . . . .  10
     10.1.  Normative References . . . . . . . . . . . . . . . . . .  10
     10.2.  Informative References . . . . . . . . . . . . . . . . .  10
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  11

1.  Introduction

   RDMA (Remote Direct Memory Access) has traditionally run over closed
   and expensive InfiniBand (IB) [IB] networks.  However, because of
   IB's limited scalability and high cost, RDMA traffic is moving to
   IP/Ethernet as its underlay network for better scalability and lower
   cost.  Supporting RDMA over IP/Ethernet using lower-priced NICs and
   switches with reduced latency is important for low-latency and
   high-throughput datacenter applications such as AI computing.

   As a result, datacenter networks (DCNs) today must not only carry
   tenant traffic over the TCP/IP protocol stack, but also carry RDMA
   traffic for High Performance Computing (HPC) and distributed storage
   access applications, which require low latency and high throughput.
   This places more stringent requirements on the basic performance of
   the DCN.

   [Requirement] discusses major problems of current RDMA fabric
   technologies and the requirements for better performance.  Also,