PFC-Free Low Delay Control Protocol
draft-dai-tsvwg-pfc-free-congestion-control-00

Document Type Active Internet-Draft (individual)
Last updated 2020-07-13
Transport Area Working Group                                 H. Dai, Ed.
Internet-Draft                                                     B. Fu
Intended status: Informational                                    K. Tan
Expires: 14 January 2021                                          Huawei
                                                            13 July 2020

                  PFC-Free Low Delay Control Protocol
             draft-dai-tsvwg-pfc-free-congestion-control-00

Abstract

   Today, low-latency transport protocols like RDMA over Converged
   Ethernet (RoCE) can provide good delay and throughput performance in
   small and lightly loaded high-speed datacenter networks due to
   lossless transport based on priority-based flow control (PFC).
   However, PFC suffers from various issues, ranging from performance
   degradation to unreliability (e.g., deadlock), which limits the
   deployment of RoCE to small-scale clusters (~1000).

   This document presents LDCP, a new transport that scales loss-
   sensitive transports, e.g., RDMA, to entire data-centers containing
   tens of thousands of machines, without depending on PFC for
   losslessness, i.e., PFC-free.  LDCP develops a novel end-to-end
   congestion control scheme and achieves very low queue occupancy even
   under high network utilization or large traffic churn, resulting in
   almost no packet loss.  Meanwhile, LDCP allows a new flow to jump
   start at full speed from the very beginning and therefore minimizes
   the latency of short RPC-style transactions.  LDCP relies only on
   WRED and ECN, two widely supported switch features, so it can be
   easily deployed in existing network infrastructures.  Finally, LDCP
   is simple by design and thus can be easily implemented on
   programmable or ASIC NICs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 14 January 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Requirements Language . . . . . . . . . . . . . . . . . .   3
   2.  LDCP algorithm  . . . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  ECN . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.2.  Stable stage algorithm  . . . . . . . . . . . . . . . . .   4
     2.3.  Zero-RTT bandwidth acquisition  . . . . . . . . . . . . .   6
   3.  Reference Implementation  . . . . . . . . . . . . . . . . . .   8
   4.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   9
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .   9
   6.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   9
     6.1.  Normative References  . . . . . . . . . . . . . . . . . .   9
     6.2.  Informative References  . . . . . . . . . . . . . . . . .  10
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  10

1.  Introduction

   Modern cloud applications, such as web search, social networking,
   real-time communication, and retail recommendation, require
   high-throughput, low-latency networks to meet the increasing demands
   of customers.  Meanwhile, new trends in data-centers, such as
   resource disaggregation, heterogeneous computing, and block storage
   over NVMe, continuously drive the need for high-speed networks.
   Recently, high-speed networks, with 40 Gbps to 100 Gbps link speeds,
   have been deployed in many large data-centers.

   Conventional software TCP/IP stacks incur high latencies and
   substantial CPU overhead, and have limited applications from fully