Data Center TCP (DCTCP): TCP Congestion Control for Data Centers
RFC 8257
Document history
Date | Rev. | By | Action |
---|---|---|---|
2021-09-29
|
10 | (System) | Received changes through RFC Editor sync (added Errata tag) |
2017-10-17
|
10 | (System) | Received changes through RFC Editor sync (created alias RFC 8257, changed title to 'Data Center TCP (DCTCP): TCP Congestion Control for Data Centers', changed abstract to 'This Informational RFC describes Data Center TCP (DCTCP): a TCP congestion control scheme for data-center traffic. DCTCP extends the Explicit Congestion Notification (ECN) processing to estimate the fraction of bytes that encounter congestion rather than simply detecting that some congestion has occurred. DCTCP then scales the TCP congestion window based on this estimate. This method achieves high-burst tolerance, low latency, and high throughput with shallow-buffered switches. This memo also discusses deployment issues related to the coexistence of DCTCP and conventional TCP, discusses the lack of a negotiating mechanism between sender and receiver, and presents some possible mitigations. This memo documents DCTCP as currently implemented by several major operating systems. DCTCP, as described in this specification, is applicable to deployments in controlled environments like data centers, but it must not be deployed over the public Internet without additional measures.', changed pages to 17, changed standardization level to Informational, changed state to RFC, added RFC published event at 2017-10-17, changed IESG state to RFC Published) |
2017-10-17
|
10 | (System) | RFC published |
2017-10-16
|
10 | (System) | RFC Editor state changed to AUTH48-DONE from AUTH48 |
2017-10-05
|
10 | (System) | RFC Editor state changed to AUTH48 from RFC-EDITOR |
2017-10-02
|
10 | (System) | RFC Editor state changed to RFC-EDITOR from EDIT |
2017-09-15
|
10 | (System) | IANA Action state changed to No IC from In Progress |
2017-09-15
|
10 | (System) | RFC Editor state changed to EDIT |
2017-09-15
|
10 | (System) | IESG state changed to RFC Ed Queue from Approved-announcement sent |
2017-09-15
|
10 | (System) | Announcement was received by RFC Editor |
2017-09-15
|
10 | (System) | IANA Action state changed to In Progress |
2017-09-15
|
10 | Amy Vezza | IESG state changed to Approved-announcement sent from Approved-announcement to be sent::Point Raised - writeup needed |
2017-09-15
|
10 | Amy Vezza | IESG has approved the document |
2017-09-15
|
10 | Amy Vezza | Closed "Approve" ballot |
2017-09-15
|
10 | Amy Vezza | Ballot approval text was generated |
2017-08-28
|
10 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-10.txt |
2017-08-28
|
10 | (System) | New version approved |
2017-08-28
|
10 | (System) | Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian |
2017-08-28
|
10 | Lars Eggert | Uploaded new revision |
2017-07-16
|
09 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-09.txt |
2017-07-16
|
09 | (System) | New version approved |
2017-07-16
|
09 | (System) | Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian |
2017-07-16
|
09 | Lars Eggert | Uploaded new revision |
2017-06-29
|
08 | Jean Mahoney | Request for Last Call review by GENART Completed: Ready with Nits. Reviewer: Orit Levin. |
2017-06-27
|
08 | (System) | IANA Review state changed to Version Changed - Review Needed from IANA OK - No Actions Needed |
2017-06-27
|
08 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-08.txt |
2017-06-27
|
08 | (System) | New version approved |
2017-06-27
|
08 | (System) | Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian |
2017-06-27
|
08 | Lars Eggert | Uploaded new revision |
2017-06-22
|
07 | Cindy Morgan | IESG state changed to Approved-announcement to be sent::Point Raised - writeup needed from Waiting for Writeup |
2017-06-22
|
07 | Benoît Claise | [Ballot comment] I have not seen any reply to Joe Clarke's OPS DIR review: Hello, WG and authors. I have reviewed rev -07 of the draft-ietf-tcpm-dctcp as requested by the OPS-DIR. This review focuses on improving operational aspects as well as any nits found in the text. This document is an informational draft that describes Data Center TCP (DCTCP), a congestion control mechanism for TCP in Data Center environments. Overall, I believe this document to be ready, with some nits and perhaps small areas for improved clarity and readability. First, I'd like to say that I appreciate the fact that this has been implemented on a number of kernels, and the authors included real-world implementation results and thoughts. From an operational perspective, that is very helpful. I also appreciated the fact that there are interoperability challenges, and those were called out in the document. My specific comments are below. There are a lot of abbreviations, variables and other terminology used throughout this document. It might be helpful for the reader to have an expanded terminology section at the top that one can refer to for all of these things. Some of the abbreviations are called out in the description of the algorithm, but not all (e.g., DCTCP.Alpha, CWR, RTT, etc.). === Section 3.2: You refer to DCTCP.Alpha before defining it. While you refer to Section 3.3 here, the impact of an incorrect Alpha value is not fully appreciated in this text. Perhaps this could be changed to reflect the impact the incorrect Alpha value would have? === Section 3.2: By abbreviating DCTCP.CE as CE in your state machine diagram, it is a bit confusing as to the difference between CE and DCTCP.CE. The description of the state machine above requires the CE codepoint to have a certain value in order for DCTCP.CE to change. Perhaps you can use D.CE as an abbreviation to be a bit clearer here. === Section 3.3: It is not clear if 'g' can be inclusive of 0 and 1. === Section 3.3: You define DCTCP.WindowEnd as the threshold for beginning a new observation window, but maybe to complement the state variable name, you should define it as the following: "The TCP sequence number threshold when one observation window ends and another is to begin; initialized to SND.UNA." === Section 3.3: You state: "Thus, when no bytes sent experienced congestion, DCTCP.Alpha equals zero, and cwnd is left unchanged." But if I use a value of 1/16 for g, with DCTCP.Alpha initialized to 1 as you say, I get a value of DCTCP.Alpha == 15/16 when there is no congestion (i.e., M == 0). === Section 3.5: You have an extra space here before the comma: "If SYN , SYN-ACK and RST packets for DCTCP connections have ECT set". This should be: "If SYN, SYN-ACK and RST packets for DCTCP connections have ECT set". === Section 3.5: You do not define ECT before using it. === Section 4.1: Can you provide a reference for NewReno? === Section 5: Can you reference or define AQM and RED? |
2017-06-22
|
07 | Benoît Claise | [Ballot Position Update] New position, No Objection, has been recorded for Benoit Claise |
2017-06-21
|
07 | Alia Atlas | [Ballot Position Update] New position, No Objection, has been recorded for Alia Atlas |
2017-06-21
|
07 | Kathleen Moriarty | [Ballot Position Update] New position, No Objection, has been recorded for Kathleen Moriarty |
2017-06-21
|
07 | Adam Roach | [Ballot comment] Given the nature of this mechanism, I would have expected some qualitative analysis of its performance under typical data center conditions, rather than the somewhat vague descriptions of it being an "improvement." If the cited literature contains such numbers, I would suggest (a) specifically citing where such data can be found; and (b) copying a very high-level summary into this document (e.g., something like: "Under typical data center load conditions, intra-center transfers of large (multi-gigabyte) files were improved by approximately 12% over Standard TCP using commodity switches in their default configuration. See [REFERENCE] for details.") Please expand the following acronyms upon first use; see https://www.rfc-editor.org/materials/abbrev.expansion.txt for guidance. - L3 - Level 3 - ECT - ECN-Capable Transport - DSCP - Differentiated Services Code Point - AQM - Active Queue Management - RED - Random Early Detection |
2017-06-21
|
07 | Adam Roach | [Ballot Position Update] New position, No Objection, has been recorded for Adam Roach |
2017-06-21
|
07 | Ben Campbell | [Ballot comment] Substantive Comments: - General: The purpose of this draft is not clear to me. Is the point to document the Microsoft implementation just for people's information? Do you have hopes other people will implement this? As written, this seems like a case of an informational draft defining protocol. That's not necessarily a problem, but it's helpful to put a paragraph near the beginning to describe why this is being published and what expectations people have of the outcome. (If the answer is along the lines of "We'd like people to implement this so we can get more operational experience", then I will wonder why the status was not "experimental".) - 1, last paragraph: I assume this means that all participants need to live in the datacenter, right? That is, no flows where only one end lives in the datacenter? (I think you clarify that later, but it would be helpful to state it here.) - 3.3: first paragraph: Why not MUST? -- "The congestion estimator on the sender SHOULD process acceptable ACKs as follows:" Why not MUST? Nits: - 1: Can you offer a citation for MapReduce? - 2: The additional text assumes the usage of 2119 keywords here do not quite map to the 2119 definitions. -- "but even compliant implementations without the measures in sections 4-6 would still only be safe to deploy in controlled environments.": That seems too important of a statement to be buried in the terminology section. - 4.1: Citation for NewReno? |
2017-06-21
|
07 | Ben Campbell | [Ballot Position Update] New position, No Objection, has been recorded for Ben Campbell |
2017-06-21
|
07 | Alissa Cooper | [Ballot Position Update] New position, No Objection, has been recorded for Alissa Cooper |
2017-06-21
|
07 | Alexey Melnikov | [Ballot Position Update] New position, No Objection, has been recorded for Alexey Melnikov |
2017-06-20
|
07 | Terry Manderson | [Ballot comment] Thank you for a well constructed document, and (IMHO) a nice approach to the issue. I noticed a few nits (some caught by Alvaro) and others that are either me reading late at night or typographical concerns (such as "If SYN , SYN-ACK" [comma placement] first line of section 3.5 - so please give it a thorough read through) |
2017-06-20
|
07 | Terry Manderson | [Ballot Position Update] New position, Yes, has been recorded for Terry Manderson |
2017-06-20
|
07 | Spencer Dawkins | [Ballot comment] I think Alvaro's nits in his ballot are worth a look, but I'm really glad to see this work moving forward, and wanted to thank the authors for a clear explanation of a TCP mechanism that I think I could implement myself. |
2017-06-20
|
07 | Spencer Dawkins | [Ballot Position Update] New position, Yes, has been recorded for Spencer Dawkins |
2017-06-20
|
07 | Suresh Krishnan | [Ballot Position Update] New position, No Objection, has been recorded for Suresh Krishnan |
2017-06-20
|
07 | Alvaro Retana | [Ballot comment] Several nits: - The Abstract says that "This memo documents existing DCTCP implementations ([WINDOWS], [LINUX], [FREEBSD])..." But in reality it doesn't, it just points to those references that presumably contain implementation information. The [WINDOWS] reference is only used in the Abstract -- last I looked, there shouldn't be references there [rfc7322]. - "...and deployment experience ([MORGANSTANLEY])." Again, this draft doesn't document deployment experience, just points at it. - rfc7942 recommends that the Implementation Status section be removed. If the intent is to keep it, then consider putting a note so that the RFC Editor doesn't remove it. - The fact that this document describes the Microsoft Windows Server 2012 implementation should be made clear from the start (in the Introduction). You could then also get rid of the extra text in Section 2. - The reference to [RFC3168-ERRATA3639] seems strange to me...not because it is pointing to the report, but because it is Informative, when the reference to RFC3168 is Normative. I would assume that because Errata3639 has been Verified, then it means it is now "part of" RFC3168, so I would think that there's no need to mention it separately... |
2017-06-20
|
07 | Alvaro Retana | [Ballot Position Update] New position, No Objection, has been recorded for Alvaro Retana |
2017-06-20
|
07 | Deborah Brungard | [Ballot Position Update] New position, No Objection, has been recorded for Deborah Brungard |
2017-06-19
|
07 | Eric Rescorla | [Ballot Position Update] New position, No Objection, has been recorded for Eric Rescorla |
2017-06-15
|
07 | Tero Kivinen | Request for Last Call review by SECDIR Completed: Has Issues. Reviewer: Catherine Meadows. |
2017-06-15
|
07 | Mirja Kühlewind | Ballot has been issued |
2017-06-15
|
07 | Mirja Kühlewind | [Ballot Position Update] New position, Yes, has been recorded for Mirja Kühlewind |
2017-06-15
|
07 | Mirja Kühlewind | Created "Approve" ballot |
2017-06-15
|
07 | Mirja Kühlewind | Ballot writeup was changed |
2017-06-15
|
07 | Mirja Kühlewind | Changed consensus to Yes from Unknown |
2017-06-15
|
07 | (System) | IESG state changed to Waiting for Writeup from In Last Call |
2017-06-09
|
07 | (System) | IANA Review state changed to IANA OK - No Actions Needed from IANA - Review Needed |
2017-06-09
|
07 | Sabrina Tanamal | (Via drafts-lastcall@iana.org): IESG/Authors/WG Chairs: The IANA Services Operator has reviewed draft-ietf-tcpm-dctcp-07.txt, which is currently in Last Call, and has the following comments: We understand that this document doesn't require any registry actions. While it's often helpful for a document's IANA Considerations section to remain in place upon publication even if there are no actions, if the authors strongly prefer to remove it, we do not object. If this assessment is not accurate, please respond as soon as possible. Thank you, Sabrina Tanamal IANA Services Specialist PTI |
2017-06-08
|
07 | Joe Clarke | Request for Last Call review by OPSDIR Completed: Has Nits. Reviewer: Joe Clarke. Sent review to list. |
2017-06-06
|
07 | Gunter Van de Velde | Request for Last Call review by OPSDIR is assigned to Joe Clarke |
2017-06-02
|
07 | Tero Kivinen | Request for Last Call review by SECDIR is assigned to Catherine Meadows |
2017-06-01
|
07 | Jean Mahoney | Request for Last Call review by GENART is assigned to Orit Levin |
2017-06-01
|
07 | Amy Vezza | IANA Review state changed to IANA - Review Needed |
2017-06-01
|
07 | Amy Vezza | The following Last Call announcement was sent out: From: The IESG To: IETF-Announce CC: tcpm@ietf.org, Michael Scharf, michael.scharf@nokia.com, draft-ietf-tcpm-dctcp@ietf.org, ietf@kuehlewind.net, tcpm-chairs@ietf.org Reply-To: ietf@ietf.org Sender: Subject: Last Call: (Datacenter TCP (DCTCP): TCP Congestion Control for Datacenters) to Informational RFC The IESG has received a request from the TCP Maintenance and Minor Extensions WG (tcpm) to consider the following document: - 'Datacenter TCP (DCTCP): TCP Congestion Control for Datacenters' as Informational RFC The IESG plans to make a decision in the next few weeks, and solicits final comments on this action. Please send substantive comments to the ietf@ietf.org mailing lists by 2017-06-15. Exceptionally, comments may be sent to iesg@ietf.org instead. In either case, please retain the beginning of the Subject line to allow automated sorting. Abstract This informational memo describes Datacenter TCP (DCTCP), a TCP congestion control scheme for datacenter traffic. DCTCP extends the Explicit Congestion Notification (ECN) processing to estimate the fraction of bytes that encounter congestion, rather than simply detecting that some congestion has occurred. DCTCP then scales the TCP congestion window based on this estimate. This method achieves high burst tolerance, low latency, and high throughput with shallow-buffered switches. This memo also discusses deployment issues related to the coexistence of DCTCP and conventional TCP, the lack of a negotiating mechanism between sender and receiver, and presents some possible mitigations. This memo documents existing DCTCP implementations ([WINDOWS], [LINUX], [FREEBSD]) and deployment experience ([MORGANSTANLEY]). DCTCP as described in this draft is applicable to deployments in controlled environments like datacenters but it must not be deployed over the public Internet without additional measures, as detailed in Section 5. The file can be obtained via https://datatracker.ietf.org/doc/draft-ietf-tcpm-dctcp/ IESG discussion can be tracked via https://datatracker.ietf.org/doc/draft-ietf-tcpm-dctcp/ballot/ The following IPR Declarations may be related to this I-D: https://datatracker.ietf.org/ipr/2319/ |
2017-06-01
|
07 | Amy Vezza | IESG state changed to In Last Call from Last Call Requested |
2017-06-01
|
07 | Mirja Kühlewind | Placed on agenda for telechat - 2017-06-22 |
2017-06-01
|
07 | Mirja Kühlewind | Last call was requested |
2017-06-01
|
07 | Mirja Kühlewind | Ballot approval text was generated |
2017-06-01
|
07 | Mirja Kühlewind | Ballot writeup was generated |
2017-06-01
|
07 | Mirja Kühlewind | IESG state changed to Last Call Requested from Publication Requested |
2017-06-01
|
07 | Mirja Kühlewind | Last call announcement was generated |
2017-06-01
|
07 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-07.txt |
2017-06-01
|
07 | (System) | New version approved |
2017-06-01
|
07 | (System) | Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian |
2017-06-01
|
07 | Lars Eggert | Uploaded new revision |
2017-05-09
|
06 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-06.txt |
2017-05-09
|
06 | (System) | New version approved |
2017-05-09
|
06 | (System) | Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian |
2017-05-09
|
06 | Lars Eggert | Uploaded new revision |
2017-04-24
|
05 | Michael Scharf | 1. Summary The document shepherd is Michael Scharf. The responsible Area Director is Mirja Kuehlewind. This informational memo describes Datacenter TCP (DCTCP). DCTCP is an improvement to TCP congestion control for datacenter traffic that uses Explicit Congestion Notification (ECN). DCTCP as described in this draft is applicable to deployments in controlled environments like datacenters, but it must not be deployed over the public Internet without additional measures. The document is published to document an implementation in the Microsoft Windows Server 2012 operating system. The Linux and FreeBSD operating systems have also implemented support for DCTCP. Given the limitations of the existing DCTCP specification, which are discussed in the document, the TCPM working group requests publication as an informational document. 2. Review and Consensus The objective of this informational memo is to document an alternative TCP congestion control algorithm that is known to be widely deployed. It is consensus in the TCPM working group that a DCTCP standard would require further work. A precise documentation of running code enables follow-up experimental or standards track RFCs. The document describes DCTCP as implemented in Microsoft Windows Server 2012. Since the publication of the first versions of the document, the Linux and FreeBSD operating systems have also implemented support for DCTCP. The specification should also enable implementation in other TCP stacks. The TCPM working group has reviewed the document regarding clarity and comprehensiveness of the protocol specification, e.g. in corner cases. The document has been discussed multiple times in the working group without any major controversy. During the working group last call there have been several detailed reviews, and those comments have been addressed in the most recent version. All in all, there is very strong consensus in the TCPM working group that this document should be published. 3. Intellectual Property Each author has stated that their direct, personal knowledge of any IPR related to this document has already been disclosed, in conformance with BCPs 78 and 79. There is an IPR disclosure for the DCTCP protocol specification (https://datatracker.ietf.org/ipr/2319/), which declares "Royalty-Free, Reasonable and Non-Discriminatory License to All Implementers". The TCPM working group is aware of this IPR but there have never been concerns. 4. Other Points None |
2017-04-24
|
05 | Michael Scharf | Responsible AD changed to Mirja Kühlewind |
2017-04-24
|
05 | Michael Scharf | IETF WG state changed to Submitted to IESG for Publication from WG Consensus: Waiting for Write-Up |
2017-04-24
|
05 | Michael Scharf | IESG state changed to Publication Requested |
2017-04-24
|
05 | Michael Scharf | IESG process started in state Publication Requested |
2017-04-20
|
05 | Michael Scharf | Changed document writeup |
2017-04-20
|
05 | Michael Scharf | IETF WG state changed to WG Consensus: Waiting for Write-Up from In WG Last Call |
2017-03-27
|
05 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-05.txt |
2017-03-27
|
05 | (System) | New version approved |
2017-03-27
|
05 | (System) | Request for posting confirmation emailed to previous authors: Stephen Bensley, Lars Eggert, Dave Thaler, Glenn Judd, Praveen Balasubramanian |
2017-03-27
|
05 | Lars Eggert | Uploaded new revision |
2017-02-15
|
04 | Michael Scharf | IETF WG state changed to In WG Last Call from WG Document |
2017-02-15
|
04 | Michael Scharf | Notification list changed to "Michael Scharf" <michael.scharf@nokia.com> |
2017-02-15
|
04 | Michael Scharf | Document shepherd changed to Michael Scharf |
2017-02-07
|
04 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-04.txt |
2017-02-07
|
04 | (System) | New version approved |
2017-02-07
|
04 | (System) | Request for posting confirmation emailed to previous authors: "Stephen Bensley", "Praveen Balasubramanian", "Dave Thaler", "Glenn Judd", "Lars Eggert" |
2017-02-07
|
04 | Lars Eggert | Uploaded new revision |
2016-11-13
|
03 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-03.txt |
2016-11-13
|
03 | (System) | New version approved |
2016-11-13
|
03 | (System) | Request for posting confirmation emailed to previous authors: "Stephen Bensley", "Praveen Balasubramanian", "Dave Thaler", "Glenn Judd", "Lars Eggert" |
2016-11-13
|
03 | Lars Eggert | Uploaded new revision |
2016-09-02
|
02 | Michael Scharf | Intended Status changed to Informational from None |
2016-07-17
|
02 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-02.txt |
2015-11-01
|
01 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-01.txt |
2015-09-22
|
00 | Michael Scharf | This document now replaces draft-bensley-tcpm-dctcp instead of None |
2015-09-22
|
00 | Lars Eggert | New version available: draft-ietf-tcpm-dctcp-00.txt |
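
The DCTCP.Alpha arithmetic raised in the OPS-DIR review quoted in the 2017-06-22 ballot comment (g = 1/16, DCTCP.Alpha initialized to 1, M = 0) can be checked with a short sketch. It assumes the estimator update has the form DCTCP.Alpha = DCTCP.Alpha * (1 - g) + g * M; that formula is not reproduced anywhere on this history page, and the function name `update_alpha` and the three-window loop are illustrative choices, not part of any implementation.

```python
from fractions import Fraction


def update_alpha(alpha, g, m):
    """One observation-window update of the congestion estimate.

    Assumed form: alpha <- alpha * (1 - g) + g * m, where g is the
    estimation gain and m is the fraction of bytes marked in the window.
    """
    return alpha * (1 - g) + g * m


# Values taken from the review comment: g = 1/16, Alpha starts at 1, M = 0.
g = Fraction(1, 16)
alpha = Fraction(1)

# Apply the update over three consecutive congestion-free windows.
history = []
for _ in range(3):
    alpha = update_alpha(alpha, g, Fraction(0))
    history.append(alpha)

for window, value in enumerate(history, start=1):
    print(f"after window {window}: Alpha = {value}")
```

Under that assumed formula, the first congestion-free window yields 15/16, matching the reviewer's figure: the estimate decays geometrically toward zero over successive windows rather than reaching zero in one step.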