Transport and Services Working Group                             M. Duke
Internet-Draft                                                    Google
Intended status: Informational                            28 August 2024
Expires: 1 March 2025

          Configuring UDP Sockets for ECN for Common Platforms
                      draft-duke-tsvwg-udp-ecn-01

Abstract

   Explicit Congestion Notification (ECN) applies to all transport
   protocols in principle.  However, it had limited applications for UDP
   until QUIC became widely deployed.  As a result, documentation of UDP
   socket APIs for ECN on various platforms is sparse.  This document
   records the results of experimenting with these APIs in order to get
   ECN working on UDP for Chromium on Apple, Linux, and Windows
   platforms.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://martinduke.github.io/udp-ecn/draft-duke-tsvwg-udp-ecn.html.
   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-duke-tsvwg-udp-ecn/.

   Discussion of this document takes place on the Transport and Services
   Working Group mailing list (mailto:tsvwg@ietf.org),
   which is archived at https://mailarchive.ietf.org/arch/browse/tsvwg/.
   Subscribe at https://www.ietf.org/mailman/listinfo/tsvwg/.

   Source for this draft and an issue tracker can be found at
   https://github.com/martinduke/udp-ecn.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

Duke                      Expires 1 March 2025                  [Page 1]
Internet-Draft                   udp-ecn                     August 2024

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 1 March 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Receiving ECN marks
     3.1.  Setting the socket to report incoming ECN marks
       3.1.1.  Linux and Apple
       3.1.2.  Windows
     3.2.  Retrieving ECN marks on incoming packets
       3.2.1.  Linux
       3.2.2.  Apple
       3.2.3.  Windows
   4.  Sending ECN marks
     4.1.  On a per-socket basis
       4.1.1.  Linux and Apple
       4.1.2.  Windows
     4.2.  On a per-packet basis
       4.2.1.  Linux and Apple
       4.2.2.  Microsoft
   5.  Security Considerations
   6.  IANA Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Acknowledgments
   Author's Address

1.  Introduction

   [RFC3168] reserves two bits in the IP header for Explicit Congestion
   Notification (ECN), which provides network feedback to endpoint
   congestion controllers.  This has historically mostly been relevant
   to TCP ([RFC9293]), where any incoming ECN marks are internally
   consumed by the kernel, and therefore imply no application interface
   except enabling and disabling the capability.

   The Stream Control Transmission Protocol (SCTP) ([RFC9260]) has long
   supported ECN in its design.  SCTP is sometimes carried over DTLS and
   UDP ([RFC8261]).  In principle, user-space implementers might have
   leveraged UDP ECN APIs to deliver ECN markings between SCTP and the
   UDP socket.  The author is not aware of any such efforts.

   [RFC6679] defines ECN over RTP over UDP.  The author is aware of a
   research implementation, but cannot confirm any commercial
   deployments.

   However, QUIC [RFC9000] runs over UDP and has seen wider deployment
   than SCTP.  The Low Latency, Low Loss, Scalable Throughput (L4S)
   experiment ([RFC9330]) and QUIC have combined to increase interest in
   ECN over UDP.

   The Chromium Projects ([CHROMIUM]) provide a widely-deployed protocol
   library that includes QUIC.  An effort to provide ECN support for
   QUIC on the many platforms on which Chromium is deployed revealed
   that many ECN-related UDP socket interfaces are poorly documented.

   This document provides a record of that experience, to encourage
   further support for ECN in other QUIC implementations, and indeed any
   consumer of ECN markings that operates over UDP.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   This document is not a general tutorial on UDP socket programming,
   and assumes familiarity with basic socket concepts like binding,
   socket options, and common system error codes.

3.  Receiving ECN marks

   Network devices can change the ECN bits in the IP header.  Since this
   feedback is required at the packet sender, the packet receiver needs
   to extract this codepoint from the UDP socket in order to report to
   the sender.

   There are two components to this: setting the socket to report
   incoming ECN marks, and retrieving the value for each incoming
   packet.

3.1.  Setting the socket to report incoming ECN marks

3.1.1.  Linux and Apple

   To report ECN, applications set a socket option to true using a
   setsockopt() call.

   IPv6 sockets require a socket option of level IPPROTO_IPV6 and name
   IPV6_RECVTCLASS.

   IPv4 sockets require a socket option of level IPPROTO_IP and name
   IP_RECVTOS.

   For dual-stack sockets, on Linux hosts the application sets both the
   IPV6_RECVTCLASS and IP_RECVTOS options to receive ECN markings on all
   incoming packets.  On Apple hosts, the application only sets
   IPV6_RECVTCLASS; setting IP_RECVTOS will return an error.

   At the time of writing, an example implementation can be found at
   [CHROMIUM-POSIX].
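
   As an illustration of the calls described above, the following is a
   minimal sketch in C, not the Chromium implementation.  The function
   name enable_ecn_reporting() and its dual_stack parameter are
   hypothetical, and the sketch assumes that the __APPLE__ preprocessor
   macro identifies Apple hosts.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Ask the kernel to attach the TOS / Traffic Class byte to every
 * datagram read with recvmsg().  `family` is the address family the
 * socket was created with; `dual_stack` is nonzero for an AF_INET6
 * socket that also carries IPv4 traffic (hypothetical parameter).
 * Returns 0 on success, -1 on error with errno set. */
int enable_ecn_reporting(int fd, int family, int dual_stack)
{
    int on = 1;

    if (family == AF_INET)
        return setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on));

    if (setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS, &on, sizeof(on)) != 0)
        return -1;

#if !defined(__APPLE__)
    /* Linux: a dual-stack socket also needs the IPv4 option. */
    if (dual_stack &&
        setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on)) != 0)
        return -1;
#else
    /* Apple: skipped, because setting IP_RECVTOS here returns an error. */
    (void)dual_stack;
#endif
    return 0;
}
```

   A caller would typically invoke this once, right after creating the
   socket and before the first read.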

3.1.2.  Windows

   Windows documentation recommends using the function WSASetRecvIPEcn()
   to enable ECN reporting regardless of the IP version.

   However, this can also be accomplished by calling setsockopt() and
   using options of level IPPROTO_IP and name IP_RECVECN for IPv4, and
   IPPROTO_IPV6 and IPV6_RECVECN for IPv6.  The author was unable to
   identify any online documentation of these options at the time of
   writing.

   For dual-stack sockets, WSASetRecvIPEcn() will not enable ECN
   reporting for IPv4.  This requires a separate setsockopt() call using
   the IP_RECVECN option.

   If a socket is bound to an IPv4-mapped IPv6 address (i.e., it is of
   the format ::ffff:<IPv4 address>), calls to WSASetRecvIPEcn() return
   error EINVAL.  These sockets should instead use an explicit
   setsockopt() call to set IP_RECVECN.

   At the time of writing, an example implementation can be found at
   [CHROMIUM-WINDOWS].

3.2.  Retrieving ECN marks on incoming packets

   All platforms described in this document require the use of a
   recvmsg() call to read data from the socket to retrieve ECN
   information, because that information is encoded in the control data
   that is returned from that function.  Those platforms all return zero
   or more control messages ("cmsgs") that contain the requested
   information about the arriving packet.

   Examples of the technique described below can be found at
   [CHROMIUM-POSIX] and [CHROMIUM-WINDOWS].

3.2.1.  Linux

   If the incoming packet is IPv4, Linux will include a cmsg of level
   IPPROTO_IP and type IP_TOS.

   If the incoming packet is IPv6, Linux will include a cmsg of level
   IPPROTO_IPV6 and type IPV6_TCLASS.

   The resulting data is the entire Type-of-Service (TOS) byte from the
   IPv4 header, or the entire Traffic Class byte from the IPv6 header.
   The ECN mark constitutes the two least-significant bits of this
   byte.
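
   A minimal sketch of this retrieval on Linux follows.  The names
   ecn_from_cmsgs() and recv_with_ecn() are hypothetical.  The sketch
   uses the IPV6_TCLASS constant from <netinet/in.h> for the IPv6 cmsg
   type, and it assumes that the IPv4 value arrives as a single byte and
   the IPv6 value as an int; the text above does not state the payload
   sizes.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* ECN codepoints from RFC 3168: the two low-order bits of the byte. */
enum { ECN_NOT_ECT = 0, ECN_ECT1 = 1, ECN_ECT0 = 2, ECN_CE = 3 };
#define ECN_MASK 0x03

/* Scan the control messages of a received datagram (Linux layout).
 * Returns the ECN codepoint, or -1 if no TOS/TCLASS cmsg is present. */
int ecn_from_cmsgs(struct msghdr *msg)
{
    for (struct cmsghdr *c = CMSG_FIRSTHDR(msg); c != NULL;
         c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_TOS) {
            unsigned char tos;      /* assumed: one byte for IPv4 */
            memcpy(&tos, CMSG_DATA(c), sizeof(tos));
            return tos & ECN_MASK;
        }
        if (c->cmsg_level == IPPROTO_IPV6 && c->cmsg_type == IPV6_TCLASS) {
            int tclass;             /* assumed: an int for IPv6 */
            memcpy(&tclass, CMSG_DATA(c), sizeof(tclass));
            return tclass & ECN_MASK;
        }
    }
    return -1;
}

/* Read one datagram and report its ECN codepoint through *ecn. */
ssize_t recv_with_ecn(int fd, void *buf, size_t len, int *ecn)
{
    /* The union keeps the control buffer aligned for struct cmsghdr. */
    union {
        char buf[2 * CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } control;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n >= 0)
        *ecn = ecn_from_cmsgs(&msg);
    return n;
}
```

   Separating the cmsg scan from the recvmsg() call keeps the parsing
   logic usable with other receive paths, such as recvmmsg().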

3.2.2.  Apple

   If a UDP message (UDP/IPv4) is received on an IPv4 socket, the
   ancillary data will contain a cmsg of level IPPROTO_IP and type
   IP_RECVTOS.  The cmsg data contains an unsigned char.

   If a UDP message (UDP/IPv6 or UDP/IPv4) is received on an IPv6
   socket, the ancillary data will contain a cmsg of level IPPROTO_IPV6
   and type IPV6_TCLASS.  The cmsg data contains an int.

   The provided data is the entire Type-of-Service (TOS) byte from the
   IPv4 header, or the entire Traffic Class byte from the IPv6 header.
   The ECN mark constitutes the two least-significant bits of this
   value.
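
   The difference in IPv4 cmsg type between Apple and Linux can be
   hidden behind a compile-time switch.  The sketch below is one
   possible arrangement, not the Chromium code.  The macro
   ECN_CMSG_TYPE_V4 and the function cmsg_to_ecn() are hypothetical
   names, and the sketch assumes that the IPv6 cmsg on Apple platforms
   carries the IPV6_TCLASS constant as its type.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* cmsg type that carries the IPv4 TOS byte on receive.  Apple
 * platforms label it IP_RECVTOS, while Linux uses IP_TOS. */
#if defined(__APPLE__)
#define ECN_CMSG_TYPE_V4 IP_RECVTOS
#else
#define ECN_CMSG_TYPE_V4 IP_TOS
#endif

/* If `c` carries a TOS or Traffic Class value, store its ECN bits in
 * *ecn and return 1; otherwise leave *ecn untouched and return 0. */
int cmsg_to_ecn(struct cmsghdr *c, int *ecn)
{
    if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == ECN_CMSG_TYPE_V4) {
        unsigned char tos;          /* IPv4 socket: unsigned char */
        memcpy(&tos, CMSG_DATA(c), sizeof(tos));
        *ecn = tos & 0x03;
        return 1;
    }
    if (c->cmsg_level == IPPROTO_IPV6 && c->cmsg_type == IPV6_TCLASS) {
        int tclass;                 /* IPv6 socket: int */
        memcpy(&tclass, CMSG_DATA(c), sizeof(tclass));
        *ecn = tclass & 0x03;
        return 1;
    }
    return 0;
}
```

   A receive loop would call this for each cmsg returned by recvmsg()
   and stop at the first one for which it returns 1.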

3.2.3.  Windows

   If the incoming packet is IPv4, the socket will include a cmsg of
   level IPPROTO_IP and type IP_ECN.

   If the incoming packet is IPv6, the socket will include a cmsg of
   level IPPROTO_IPV6 and type IPV6_ECN.

   The resulting integer solely consists of the ECN mark, and requires
   no further bitwise operations.

4.  Sending ECN marks

   Existing ECN specifications envision a particular connection
   consistently sending the same ECN marking.  It might change that
   marking after successfully completing a handshake, after recognizing
   that the path or the peer does not support ECN, or after moving to a
   new path.
   Therefore, using a socket option to configure a consistent marking is
   generally more resource-efficient.

   However, some server designs receive all incoming packets on a single
   socket.  Because the many connections that share this socket may
   differ in their support for ECN, such servers are better served by
   configuring outgoing ECN on a per-packet basis.

4.1.  On a per-socket basis

4.1.1.  Linux and Apple

   Both Linux and Apple platforms set the outgoing ECN for IPv4 packets
   with a socket option of level IPPROTO_IP and name IP_TOS.

   For IPv6 packets, they use level IPPROTO_IPV6 and name IPV6_TCLASS.

   This setsockopt() call also sets the Differentiated Services Code
   Point (DSCP) bits that make up the rest of the TOS byte.
   Applications making this call will generally want to preserve any
   existing DSCP setting, which might require a getsockopt() call.

   For dual-stack sockets, we hypothesize that Linux sockets will
   require an additional setsockopt() call with IP_TOS, while Apple
   sockets will not require it and will return an error if the call is
   made.  Our experiments did not test this hypothesis.

   An example of the technique described above can be found at
   [CHROMIUM-POSIX].
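
   The read-modify-write step that preserves DSCP might look like the
   following sketch.  The function name set_socket_ecn() is
   hypothetical, and the sketch does not attempt the untested dual-stack
   case.  Treating a negative value from getsockopt() as zero is a
   defensive assumption for stacks that report an unset traffic class
   that way.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Set the ECN codepoint (0..3) for all packets sent on `fd`, keeping
 * whatever DSCP value is already configured on the socket.
 * Returns 0 on success, -1 on error with errno set. */
int set_socket_ecn(int fd, int family, int ecn)
{
    int level = (family == AF_INET) ? IPPROTO_IP : IPPROTO_IPV6;
    int name = (family == AF_INET) ? IP_TOS : IPV6_TCLASS;
    int tos = 0;
    socklen_t len = sizeof(tos);

    /* Read the current byte so the upper six (DSCP) bits survive. */
    if (getsockopt(fd, level, name, &tos, &len) != 0)
        return -1;
    if (tos < 0)
        tos = 0;    /* assumption: negative means "not configured" */

    /* Replace only the two ECN bits. */
    tos = (tos & 0xfc) | (ecn & 0x03);
    return setsockopt(fd, level, name, &tos, sizeof(tos));
}
```

   For example, on a socket whose TOS byte is 0x28, a call with ecn set
   to 2 (ECT(0)) leaves the byte at 0x2a.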

4.1.2.  Windows

   The author did not experiment with setting a Windows socket to send
   an ECN mark.

4.2.  On a per-packet basis

   Packets can be individually marked with ECN codepoints using the
   control information that accompanies a sendmsg() call.

4.2.1.  Linux and Apple

   These platforms expect a cmsg with level IPPROTO_IP and type IP_TOS
   if the destination is an IPv4 address or an IPv4-mapped IPv6 address.

   Otherwise, they expect a cmsg with level IPPROTO_IPV6 and type
   IPV6_TCLASS.
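
   A sketch of the per-packet technique follows.  The names
   dest_uses_ipv4_cmsg() and send_with_tos() are hypothetical.  Passing
   the value as an int is an assumption of this sketch, and the cmsg
   supplies the whole TOS or Traffic Class byte, so the caller is
   expected to include any desired DSCP bits in the tos argument.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Returns 1 if `dest` should be marked through the IPv4 cmsg: a plain
 * IPv4 address or an IPv4-mapped IPv6 address.  Returns 0 otherwise. */
int dest_uses_ipv4_cmsg(const struct sockaddr *dest)
{
    if (dest->sa_family == AF_INET)
        return 1;
    if (dest->sa_family == AF_INET6) {
        const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)dest;
        return IN6_IS_ADDR_V4MAPPED(&a6->sin6_addr) ? 1 : 0;
    }
    return 0;
}

/* Send one datagram whose TOS / Traffic Class byte is `tos`
 * (DSCP in the upper six bits, ECN codepoint in the lower two). */
ssize_t send_with_tos(int fd, const void *buf, size_t len,
                      const struct sockaddr *dest, socklen_t dest_len,
                      int tos)
{
    /* The union keeps the control buffer aligned for struct cmsghdr. */
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } control;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg;

    memset(&control, 0, sizeof(control));
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = (void *)dest;
    msg.msg_namelen = dest_len;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    /* Pick the cmsg level and type from the destination address. */
    int v4 = dest_uses_ipv4_cmsg(dest);
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = v4 ? IPPROTO_IP : IPPROTO_IPV6;
    c->cmsg_type = v4 ? IP_TOS : IPV6_TCLASS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &tos, sizeof(tos));

    return sendmsg(fd, &msg, 0);
}
```

   A server that shares one socket among many connections could call
   send_with_tos() with a different codepoint for each connection.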

4.2.2.  Microsoft

   Windows uses a cmsg with level IPPROTO_IP and type IP_ECN for IPv4
   packets.

   Windows uses a cmsg with level IPPROTO_IPV6 and type IPV6_ECN for
   IPv6 packets.

   An example of the technique described above can be found at
   [CHROMIUM-WINDOWS].

5.  Security Considerations

   The security implications of ECN are documented in [RFC3168] and
   [RFC9330].  This document is a guide to enabling these capabilities,
   which incurs no additional security considerations.

   Note that implementing ECN capabilities on some platforms, but not
   others, can help to fingerprint the operating system in use by a
   host, which can have privacy implications.  This document aims to
   mitigate that possibility.

6.  IANA Considerations

   This document has no IANA actions.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

7.2.  Informative References

   [CHROMIUM] "The Chromium Projects", n.d.,
              <https://www.chromium.org/chromium-projects/>.

   [CHROMIUM-POSIX]
              "udp_socket_posix.cc", n.d.,
              <https://source.chromium.org/chromium/chromium/
              src/+/main:net/socket/udp_socket_posix.cc>.

   [CHROMIUM-WINDOWS]
              "udp_socket_win.cc", n.d.,
              <https://source.chromium.org/chromium/chromium/
              src/+/main:net/socket/udp_socket_win.cc>.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001,
              <https://www.rfc-editor.org/info/rfc3168>.

   [RFC9293]  Eddy, W., Ed., "Transmission Control Protocol (TCP)",
              STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
              <https://www.rfc-editor.org/info/rfc9293>.

   [RFC9260]  Stewart, R., Tüxen, M., and K. Nielsen, "Stream Control
              Transmission Protocol", RFC 9260, DOI 10.17487/RFC9260,
              June 2022, <https://www.rfc-editor.org/info/rfc9260>.

   [RFC8261]  Tuexen, M., Stewart, R., Jesup, R., and S. Loreto,
              "Datagram Transport Layer Security (DTLS) Encapsulation of
              SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November
              2017, <https://www.rfc-editor.org/info/rfc8261>.

   [RFC6679]  Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
              and K. Carlberg, "Explicit Congestion Notification (ECN)
              for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August
              2012, <https://www.rfc-editor.org/info/rfc6679>.

   [RFC9000]  Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
              Multiplexed and Secure Transport", RFC 9000,
              DOI 10.17487/RFC9000, May 2021,
              <https://www.rfc-editor.org/info/rfc9000>.

   [RFC9330]  Briscoe, B., Ed., De Schepper, K., Bagnulo, M., and G.
              White, "Low Latency, Low Loss, and Scalable Throughput
              (L4S) Internet Service: Architecture", RFC 9330,
              DOI 10.17487/RFC9330, January 2023,
              <https://www.rfc-editor.org/info/rfc9330>.

Acknowledgments

   The author would like to thank Ryan Hamilton, who provided constant
   advice through this effort.  Randall Meyer from Apple and Nick Grifka
   from Microsoft provided useful hints about the behavior of their
   respective operating systems.

   Will Hawkins, Max Inden, Colin Perkins, and Michael Tuexen made
   improvements to this draft.

Author's Address

   Martin Duke
   Google
   Email: martin.h.duke@gmail.com
