Methods for Mitigation of Congestion and Load Issues on RADIUS Servers
draft-janfred-radext-radius-congestion-control-01
This document is an Internet-Draft (I-D).
Anyone may submit an I-D to the IETF.
This I-D is not endorsed by the IETF and has no formal standing in the
IETF standards process.
| Document | Type | Active Internet-Draft (individual) | |
|---|---|---|---|
| Author | Jan-Frederik Rieckers | ||
| Last updated | 2025-10-20 | ||
| RFC stream | (None) | ||
| Intended RFC status | (None) | ||
| Stream | Stream state | (No stream defined) | |
| Consensus boilerplate | Unknown | ||
| RFC Editor Note | (None) | ||
| IESG | IESG state | I-D Exists | |
| Telechat date | (None) | ||
| Responsible AD | (None) | ||
| Send notices to | (None) |
RADIUS EXTensions J.-F. Rieckers
Internet-Draft DFN
Intended status: Experimental 20 October 2025
Expires: 23 April 2026
Methods for Mitigation of Congestion and Load Issues on RADIUS Servers
draft-janfred-radext-radius-congestion-control-01
Abstract
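The RADIUS protocol as defined in [RFC2865] does not have a means to
signal server overload or congestion to the clients. This can lead
to load problems, especially in a federated RADIUS proxy fabric.
This document defines RADIUS attributes with which a server can ask
capable RADIUS Instances earlier in the proxy chain to delay
responses or to reject repeated requests on its behalf.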
About This Document
This note is to be removed before publishing as an RFC.
Status information for this document may be found at
https://datatracker.ietf.org/doc/draft-janfred-radext-radius-
congestion-control/.
Discussion of this document takes place on the RADIUS EXTensions
mailing list (mailto:radext@ietf.org), which is archived at
https://mailarchive.ietf.org/arch/browse/radext/. Subscribe at
https://www.ietf.org/mailman/listinfo/radext/.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 23 April 2026.
Rieckers Expires 23 April 2026 [Page 1]
Internet-Draft RADIUS Congestion Mitigations October 2025
Copyright Notice
Copyright (c) 2025 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Conventions and Definitions
3. Protocol Description
3.1. Proxy-Capability Attribute
3.2. Response-Delay Attribute
3.3. Request-Block Attribute
3.3.1. Request-Block-Attribute and Request-Block-Extended-Attribute
3.4. RADIUS Instance behavior
3.4.1. RADIUS proxy
3.4.2. Enforcing Instance
4. Security Considerations
5. Recommendations for Operators
6. IANA Considerations
7. References
7.1. Normative References
7.2. Informative References
Appendix A. Document Status
A.1. Change History
Acknowledgments
Author's Address
1. Introduction
The RADIUS protocol [RFC2865] does not have a means to signal server
overload or congestion to RADIUS clients. These overload situations
may be a result of a high load of legitimate traffic and might even
be worsened by retransmissions of packets the server failed to answer
due to the high load. These situations can arise in many scenarios.
In a RADIUS proxy fabric, a server overload may even result from a
single RADIUS client, for example when an EAP supplicant immediately
starts a new authentication attempt after receiving a reject.
Especially in RADIUS proxy fabrics, the impact of misbehaving clients
on the whole proxy chain can be reduced by reducing the packet load
at the entry level or as early in the proxy chain as possible. Since
the end user device cannot be controlled, we have to rely on the
RADIUS proxies to implement countermeasures.
These countermeasures can be used to reduce the load by one of two
methods.
First, the response to requests can be delayed. By delaying RADIUS
responses, the client has to wait for the answer to send its next
request, which decreases the packet load on the server. This method
can also be used to slow down clients that immediately retry the
authentication once they receive a reject.
Second, requests that cannot succeed can be rejected early. When a
home server knows that an authentication of a client cannot succeed
(for example because the client used an expired certificate with
EAP-TLS), and the client keeps retrying, any RADIUS Instance along
the proxy chain could generate a reject for this specific user.
Pushing these countermeasures to the earliest possible RADIUS
Instance inside the proxy chain has multiple advantages over applying
them at the home server. First, it reduces the load on all proxies
in the proxy chain, since they do not need to forward traffic that
will get rejected anyway. Second, when a response is to be delayed,
pushing this delay as far down the proxy chain as possible keeps
RADIUS retransmissions from being proxied. A RADIUS proxy that
already holds the response does not need to proxy retransmitted
RADIUS packets, which reduces the load on the RADIUS proxies later in
the proxy chain. Instead, the RADIUS proxy ignores the
retransmission, since it already has an answer for this RADIUS
packet; the answer is merely delayed.
2. Conventions and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
Additionally, we use the following terms in this document, in the
meaning as defined here:
RADIUS Instance A single device or software module that implements
the RADIUS protocol
RADIUS Client A RADIUS Instance that sends RADIUS requests and
receives RADIUS responses. This is only in reference to a single
RADIUS hop.
RADIUS Server A RADIUS Instance that receives RADIUS requests and
sends RADIUS responses. Similar to the RADIUS Client, this is
only in reference to a single RADIUS hop.
RADIUS Proxy A single device or software module that acts as RADIUS
server and RADIUS client at the same time. It receives RADIUS
requests and forwards them towards the next RADIUS proxy, usually
based on the realm of the User-Name attribute.
RADIUS Proxy Chain The list of RADIUS Instances that a RADIUS
Request traverses from the first RADIUS Client across any number
of RADIUS proxies to the final RADIUS Server that responds to the
RADIUS Request. When referring to the RADIUS Request, the chain
starts from the first RADIUS client sending the RADIUS Request and
ends at the last RADIUS Server. For the RADIUS Response, the
chain is reversed. The terms "earlier" and "later" will always be
used together with a reference to a request or a response. As an
example: a RADIUS proxy earlier in the chain for a request is
located later in the chain for the response.
Enforcing Instance The RADIUS Instance that enforces the response
delay or the request blocking.
3. Protocol Description
The protocol extension consists of two parts: First, any RADIUS proxy
in the proxy chain capable of either of the two countermeasures needs
to signal this capability to the following RADIUS proxies and the
home server, so they know whether or not they can use this feature.
Second, for the reply, the home server or RADIUS proxy needs to
signal the reply policy back to the previous RADIUS proxies.
3.1. Proxy-Capability Attribute
The Proxy-Capability Attribute is used to signal the capability of a
RADIUS proxy to any RADIUS entity in the later proxy chain. With the
help of this, on the reply path, a RADIUS proxy can determine whether
the requested action should be performed by itself or the packet will
pass through another capable proxy later which can then perform the
actions.
The Proxy-Capability Attribute is of type string as defined in
[RFC8044], Section 3.5. The value of the Proxy-Capability Attribute
is a concatenated list of the proxy capabilities the RADIUS Instance
has.
Correct formal description: TODO
Informal description: Concatenate all capabilities. Values up to 127
are encoded in one byte; extended capabilities are encoded as two
bytes. When parsing, the receiver looks at the first bit of each
capability: if it is 0, the capability is a single-byte value; if it
is 1, the capability is a two-byte value. This allows for simple
extension while keeping the encoding as simple and short as possible.
The attribute MUST NOT include a capability multiple times.
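The informal description above can be sketched in Python as follows.
This is only an illustration, not the formal description that is
still TODO. The function names are hypothetical, and since the layout
of two-byte capabilities is not yet specified, the sketch assumes
that a two-byte capability is handled as its raw 16-bit big-endian
value with the first bit still set.

```python
def parse_capabilities(value: bytes) -> list[int]:
    """Split a Proxy-Capability value into capability codes.

    Assumption: a two-byte capability is returned as its raw 16-bit
    big-endian value (first bit still set).
    """
    caps = []
    i = 0
    while i < len(value):
        first = value[i]
        if (first & 0x80) == 0:
            # First bit is 0: single-byte capability (0..127).
            caps.append(first)
            i += 1
        else:
            # First bit is 1: the capability occupies two bytes.
            if i + 1 >= len(value):
                raise ValueError("truncated two-byte capability")
            caps.append((first << 8) | value[i + 1])
            i += 2
    if len(caps) != len(set(caps)):
        raise ValueError("capability listed more than once")
    return caps


def encode_capabilities(caps: list[int]) -> bytes:
    """Concatenate capability codes into a Proxy-Capability value."""
    out = bytearray()
    for cap in caps:
        if 0 <= cap <= 0x7F:
            out.append(cap)
        elif 0x8000 <= cap <= 0xFFFF:
            out += cap.to_bytes(2, "big")
        else:
            raise ValueError(f"capability {cap} is not encodable")
    return bytes(out)
```

Because the width of each entry is decided by its first bit alone, a
receiver can skip over capabilities it does not know.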
Each capable RADIUS Instance in the RADIUS Proxy Chain SHOULD add the
Proxy-Capability Attribute to Access-Request and Accounting-Request
packets before forwarding the RADIUS packet to the next RADIUS
Instance. Future specifications MAY define capabilities for other
RADIUS packet types. When Proxy-Capability is used with other RADIUS
packet types, the capabilities defined in this document SHOULD only
be added to Access-Request and Accounting-Request packets.
When a capable RADIUS proxy receives a RADIUS packet with the Proxy-
Capability Attribute, the RADIUS Proxy SHOULD add its own
capabilities to the Attribute if the capability is not yet included.
The RADIUS Proxy MUST NOT remove existing capabilities, unless
explicitly configured to remove them. As a hint, administrators
SHOULD only configure the removal of capabilities when they know that
the capability is not honored.
In this document, we define two capabilities:
Capability Response-Delay-Capable The Capability Response-Delay-
Capable with value 1 is used to signal that the RADIUS Instance is
capable of delaying RADIUS Response packets.
Capability Request-Block-Capable The capability Request-Block-
Capable with value 2 is used to signal that the RADIUS Instance is
capable of blocking RADIUS Requests that match specific criteria
and sending an Access-Reject instead on behalf of the home server.
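As an illustration of how a proxy could add its own capabilities
without removing or duplicating existing ones, consider the following
Python sketch. The function name and constants are hypothetical; the
values 1 and 2 are the ones defined above. The sketch only adds
single-byte capabilities and copies any two-byte capabilities it
finds unchanged.

```python
RESPONSE_DELAY_CAPABLE = 1
REQUEST_BLOCK_CAPABLE = 2


def add_own_capabilities(received: bytes, own: list[int]) -> bytes:
    """Return the Proxy-Capability value to send to the next hop.

    `received` is the value from the incoming request (empty if the
    attribute was absent). Existing capabilities are never removed.
    """
    present = set()
    i = 0
    while i < len(received):
        # First bit set means a two-byte capability.
        width = 2 if received[i] & 0x80 else 1
        present.add(received[i:i + width])
        i += width

    out = bytearray(received)
    for cap in own:
        if not 0 < cap < 128:
            raise ValueError("sketch handles single-byte capabilities only")
        encoded = bytes([cap])
        if encoded not in present:
            out += encoded
            present.add(encoded)
    return bytes(out)
```

A proxy supporting both mechanisms would call
add_own_capabilities(value, [RESPONSE_DELAY_CAPABLE,
REQUEST_BLOCK_CAPABLE]) before forwarding the request.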
3.2. Response-Delay Attribute
The Response-Delay Attribute is used to signal the desire of the home
server that sending of the RADIUS response should be delayed for a
certain amount of time. The Response-Delay Attribute is of type
integer as
defined in [RFC8044], Section 3.1. The value is the delay in
milliseconds.
3.3. Request-Block Attribute
The Request-Block Attribute is used to signal the desire of the home
server that future requests that match certain criteria should be
rejected by a RADIUS Instance on behalf of the home server. The
Request-Block Attribute is of type tlv as defined in [RFC8044],
Section 3.13. The sub-attributes are defined as follows:
Request-Block-Period This sub-attribute contains the time span in
seconds during which requests matching the description should be
rejected. The attribute is of type integer as defined in
[RFC8044], Section 3.1.
Request-Block-Attributes This sub-attribute contains a list of
attribute types that should be used to match authentication
requests of the same user. The attribute is of type string as
defined in [RFC8044], Section 3.5 and contains a concatenated list
of one byte attribute types.
Request-Block-Extended-Attribute This sub-attribute contains a
reference to a single extended attribute that should be included
in the match. The attribute is of type string as defined in
[RFC8044], Section 3.5.
3.3.1. Request-Block-Attribute and Request-Block-Extended-Attribute
RADIUS attributes are formatted as Type-Length-Value with a fixed
one-byte type field. Since this allows for only 256 attribute types,
extended attributes have been defined in [RFC6929]. Additionally,
RADIUS has vendor-specific attributes ([RFC2865], Section 5.26 or
[RFC6929], Section 2.4), the header of which may not be known to all
implementations. To still allow filtering on individual extended or
vendor-specific attributes that might be unknown to the Enforcing
Instance, we need a means to reference these attributes.
The Request-Block-Attributes sub-attribute is used to reference the
"primitive" RADIUS attributes, that is, RADIUS attributes that only
have the one-byte attribute type. Since each of these types is
exactly one byte long, we can reduce the overhead by concatenating
the attribute types.
For the extended or vendor-specific attributes this is not as easy,
since the length of the header may vary between the different
attributes. Therefore, we have the Request-Block-Extended-Attribute
sub-attribute, which references a single attribute that should be
included in the blocklist. This sub-attribute can be included
multiple times in the Request-Block attribute.
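To make the structure of the Request-Block value more concrete, the
following Python sketch assembles the sub-attributes as nested TLVs
(one-byte type, one-byte length that includes the two header bytes,
then the value). The sub-attribute type numbers 1, 2 and 3 are the
tentative values from the IANA Considerations and may change. The
function names are hypothetical, and the content of a
Request-Block-Extended-Attribute is treated as opaque bytes because
its exact format is not yet specified.

```python
import struct

# Tentative sub-attribute types (see IANA Considerations).
REQUEST_BLOCK_PERIOD = 1
REQUEST_BLOCK_ATTRIBUTES = 2
REQUEST_BLOCK_EXTENDED_ATTRIBUTE = 3


def _tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one sub-attribute; the length covers type and length."""
    length = len(value) + 2
    if length > 255:
        raise ValueError("sub-attribute too long")
    return bytes([tlv_type, length]) + value


def encode_request_block(period_s: int, attr_types: list[int],
                         extended_refs: tuple[bytes, ...] = ()) -> bytes:
    """Build the value of a Request-Block attribute.

    period_s      block period in seconds (32-bit unsigned integer)
    attr_types    one-byte attribute types to match on
    extended_refs opaque references, one sub-attribute per reference
    """
    body = _tlv(REQUEST_BLOCK_PERIOD, struct.pack("!I", period_s))
    body += _tlv(REQUEST_BLOCK_ATTRIBUTES, bytes(attr_types))
    for ref in extended_refs:
        body += _tlv(REQUEST_BLOCK_EXTENDED_ATTRIBUTE, ref)
    return body
```

For example, a five-minute block matching on User-Name (1) and
Calling-Station-Id (31) would be encode_request_block(300, [1, 31]).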
3.4. RADIUS Instance behavior
TODO - mostly stub or not complete specification, general description
of the behavior for every involved party
3.4.1. RADIUS proxy
Any RADIUS Instance capable of delaying or blocking SHOULD add the
Proxy-Capability attribute to every RADIUS Access-Request it sends
to a RADIUS server. If a RADIUS Instance receives a RADIUS request
with this attribute, it SHOULD add its own capability, if not present
already, to the proxied RADIUS packet and SHOULD NOT remove any other
capability.
Upon reception of a RADIUS response, a RADIUS Proxy needs to decide
whether it is the Enforcing Instance, by looking at the original
request. This decision has to be made individually for the
Response-Delay and the Request-Block policy.
If the received RADIUS response contains a Response-Delay attribute
and the original request contained the Response-Delay capability, the
RADIUS Proxy SHOULD NOT enforce the policy and instead forward the
RADIUS response to the RADIUS client. If the original request did
not contain the Response-Delay capability, the RADIUS Proxy MUST
become the Enforcing Instance for the Response-Delay.
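The Response-Delay decision can be sketched in Python as follows.
The names are hypothetical; received_request_caps stands for the
capability codes found in the Proxy-Capability attribute of the
request as it was received by this proxy.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"   # leave enforcement to an earlier instance
    ENFORCE = "enforce"   # this proxy becomes the Enforcing Instance


RESPONSE_DELAY_CAPABLE = 1


def response_delay_action(received_request_caps: set[int],
                          response_has_delay: bool) -> Action:
    """Decide how to handle a response with regard to Response-Delay."""
    if not response_has_delay:
        return Action.FORWARD
    if RESPONSE_DELAY_CAPABLE in received_request_caps:
        # A capable instance earlier in the request chain exists.
        return Action.FORWARD
    return Action.ENFORCE
```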
The algorithm for the Request-Block functionality is similar, but
needs additional considerations regarding the attributes that are
present. If the received RADIUS response contains a Request-Block
attribute and the original request did not contain the Request-Block
capability, the RADIUS Proxy MUST become the Enforcing Instance for
the Request-Block. Otherwise, the RADIUS Proxy MUST check whether
all attributes referenced in the Request-Block are present in the
original RADIUS request. If an attribute was not present in the
RADIUS request, the RADIUS Proxy MUST check whether the missing
attribute was added by the RADIUS Proxy itself and is present in the
RADIUS request the RADIUS Proxy sent to the next hop RADIUS server.
In this case the RADIUS Proxy MUST become the Enforcing Instance,
since any RADIUS Instance after the current RADIUS Instance cannot
enforce the requested blocking, as at least one of the attributes is
missing. If the missing attribute was not added by the RADIUS Proxy,
the RADIUS Proxy SHOULD remove the Request-Block attribute before
forwarding the packet to the next RADIUS Client. In this case, the
missing attribute was added by a RADIUS Instance not capable of
Request-Block and positioned between the current and a later
Request-Block capable RADIUS Instance in the RADIUS Proxy Chain of
the RADIUS Request. Since we have no RADIUS-native method to signal
this condition back, the best approach to deal with this is to ignore
the block request. By removing the Request-Block attribute from the
response, we reduce the load on later RADIUS Instances on the RADIUS
Proxy Chain for the RADIUS Response, since they do not need to
perform the attribute checks.
This removes the Request-Block functionality completely without
signalling back to the capable RADIUS Instances earlier in the
Response Proxy Chain. There is no easy solution to this problem, but
this is considered preferable to complicated signalling mechanisms
that would only be beneficial in a limited number of use cases and
would increase the complexity of implementations. We therefore rely
on the RADIUS server not to request a block based on attributes that
may have been added along the proxy chain. See Section 5 for further
discussion.
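One possible reading of the Request-Block decision is sketched below
in Python. All names are hypothetical. The sketch makes two
assumptions that the text does not spell out: when all referenced
attributes were already present in the received request, the response
is forwarded unchanged, and when several attributes are missing, the
proxy only enforces if it added all of them itself.

```python
from enum import Enum


class Action(Enum):
    ENFORCE = "enforce"   # become the Enforcing Instance
    FORWARD = "forward"   # forward the response unchanged
    STRIP = "strip"       # remove Request-Block, then forward


REQUEST_BLOCK_CAPABLE = 2


def request_block_action(received_caps: set[int],
                         received_attrs: set[int],
                         sent_attrs: set[int],
                         referenced: set[int]) -> Action:
    """Decide how to handle a response carrying Request-Block.

    received_caps   capabilities in the request as received
    received_attrs  attribute types in the request as received
    sent_attrs      attribute types in the request sent to the next hop
    referenced      attribute types referenced by the Request-Block
    """
    if REQUEST_BLOCK_CAPABLE not in received_caps:
        return Action.ENFORCE
    missing = referenced - received_attrs
    if not missing:
        # An earlier capable instance saw every referenced attribute.
        return Action.FORWARD
    if missing <= sent_attrs:
        # This proxy added the missing attributes itself.
        return Action.ENFORCE
    return Action.STRIP
```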
3.4.2. Enforcing Instance
An Enforcing Instance is in charge of performing the requested
action.
In general, an Enforcing Instance MUST remove the corresponding
RADIUS Attribute with the request to delay or block from the RADIUS
response before forwarding it to the next RADIUS Client.
3.4.2.1. Response-Delay
An Enforcing Instance for the Response-Delay MUST delay the response
it received from the RADIUS server before forwarding the RADIUS
response to the next RADIUS Client. If the Enforcing Instance is not
a RADIUS Proxy, any action that would normally follow the reception
of the RADIUS response MUST be delayed, i.e. the Enforcing Instance
acts as if the RADIUS response has not been received until the delay
timespan has passed.
An Enforcing Instance MUST NOT retransmit the RADIUS Request to the
next hop RADIUS Server if it already received a RADIUS response with
the Response-Delay attribute. If the Enforcing Instance is a RADIUS
Proxy and the RADIUS Client retransmits the RADIUS request, the
Enforcing Instance MUST silently discard the retransmission. This
only applies to Enforcing Instances; any other RADIUS Proxy will
still follow its normal retransmission policy.
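A minimal Python sketch of this behavior for an Enforcing Instance
that is a RADIUS Proxy is given below. The class and method names are
hypothetical. Identifying a request by the pair of client and packet
identifier is a simplification; a real implementation would also have
to compare further fields such as the Request Authenticator. The
clock is injectable so that the sketch can be exercised without
actually waiting.

```python
import time


class DelayedResponses:
    """Holds responses that are being delayed for their clients."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # (client, packet_id) -> (release time, held response)
        self._pending = {}

    def hold(self, client, packet_id, response, delay_ms):
        """Store a response until its delay has passed."""
        release_at = self._clock() + delay_ms / 1000.0
        self._pending[(client, packet_id)] = (release_at, response)

    def on_client_request(self, client, packet_id):
        """Return "discard" for a retransmission of a request whose
        response is already held, otherwise "normal"."""
        if (client, packet_id) in self._pending:
            return "discard"
        return "normal"

    def release_due(self):
        """Remove and return all responses whose delay has passed."""
        now = self._clock()
        due = [key for key, (t, _) in self._pending.items() if t <= now]
        return [(key, self._pending.pop(key)[1]) for key in due]
```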
The Enforcing Instance SHOULD delay the RADIUS response according to
the time span in the Response-Delay attribute; however, the Enforcing
Instance MAY have an upper limit for the delay. By default, this
upper limit SHOULD NOT be less than 10000 milliseconds (10 seconds)
and it SHOULD be configurable by the administrator.
// TODO: Reasoning behind the 10 sec delay: A common timeout limit
// for "response is missing, stop retrying" is 30 seconds, so by
// keeping the default upper limit below this we ensure that the
// response gets to the client, but it is delayed. We want to have
// the time configurable, because delaying responses uses up
// resources on the server. I deliberately didn't include text that
// the response should be sent if the ID space is exhausted, because
// the ID exhaustion may be the reason for the response delay further
// down the proxy chain, so in order to prevent impact on the later
// proxy chain, we need to shift the problem as far to the beginning
// of the proxy chain as possible.
//
// -- Janfred
// TODO: Maybe we should define a lower limit for the upper-limit
// config, i.e. the upper limit must not be less than 500
// milliseconds.
//
// -- Janfred
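How an Enforcing Instance could apply the configured upper limit to a
received Response-Delay value is sketched below in Python. The names
are hypothetical. The sketch assumes the value is carried as a 32-bit
unsigned integer in network byte order, as for the integer data type
in [RFC8044], and uses 10000 milliseconds as the default upper limit.

```python
import struct

DEFAULT_UPPER_LIMIT_MS = 10_000  # default from the text above


def effective_delay_ms(value: bytes,
                       upper_limit_ms: int = DEFAULT_UPPER_LIMIT_MS) -> int:
    """Decode a Response-Delay value and clamp it to the upper limit."""
    if len(value) != 4:
        raise ValueError("integer attributes carry exactly four bytes")
    (requested_ms,) = struct.unpack("!I", value)
    return min(requested_ms, upper_limit_ms)
```

An administrator who raises the limit to 30 seconds would call
effective_delay_ms(value, upper_limit_ms=30_000).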
3.4.2.2. Request-Block
An Enforcing Instance for the Request-Block MUST reject RADIUS
requests that match the signaled attributes.
To prevent servers from blocking legitimate traffic by setting the
block filter to arbitrary values, the Request-Block is always
dependent on the attributes of the original RADIUS request.
TODO: From here only stub.
General algorithm:
* Create a list of attributes.
* Attributes that appear multiple times in the request must be
  included multiple times in the list as well.
* For extended attributes: add every attribute to the list with the
  prefix in the Request-Block-Extended-Attribute (but skip over the
  length byte in the attribute in the request).
* When a request is received with attributes matching the list (every
  attribute must be present and match): send an Access-Reject with an
  Error-Cause attribute set to value TBD4 (Request ratelimited).
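The matching part of this stub can be illustrated with the following
Python sketch. It is not a complete specification: it only handles
attributes with a one-byte type, ignores extended attributes and the
block period, and does not build the Access-Reject. The function
names are hypothetical, and attributes are modeled as (type, value)
pairs.

```python
from collections import Counter


def build_block_entry(original_attrs, block_types):
    """Create the list of attributes to match on.

    The values are taken from the original request, so a server can
    only request a block for values the client actually sent.
    Attributes that appear several times are counted several times.
    """
    wanted = set(block_types)
    entry = Counter((t, v) for t, v in original_attrs if t in wanted)
    missing = wanted - {t for t, _ in entry}
    if missing:
        raise ValueError(f"attributes {sorted(missing)} not in request")
    return entry


def matches(entry, request_attrs):
    """True if every attribute in the entry is present and equal."""
    return not (entry - Counter(request_attrs))
```

A later request that matches an active entry would be answered with
an Access-Reject instead of being proxied.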
The considerations for an upper limit should probably also apply,
similar to the Response-Delay, but with much higher defaults.
Maybe add functionality that the block is automatically timed out
after a period in which no login has been observed (to free up
space), but "reset" the counter if a login is observed again.
4. Security Considerations
TODO Security
5. Recommendations for Operators
TODO.
Elements to consider:
* When should responses be delayed?
  - when nearing ID-Exhaustion due to many ongoing EAP sessions, add
    small delays
  - on high server load, add a small delay
  - add a delay for a reject (FreeRADIUS has the option already,
    let's push this to the edge instead)
* When should the home server request a block, and for how long?
  - outer username in EAP wrong - probably hours
  - inner username in EAP does not exist (and has several failed
    attempts within a short time) - few minutes (outer username may
    stay the same if user changes only inner username)
  - username points to someone that is no longer eligible for
    access - probably hours
* When should a proxy request a block?
  - repeated requests immediately after a reject with impact on the
    network - seconds to minutes
  - realm in username is unroutable - minutes to hours
* What are good and bad choices for attributes?
  - good
    * User-Name. Always.
    * Calling-Station-Id
    * Called-Station-Id (i.e. only block phone on this specific
      access point)
    * NAS-Identifier, NAS-IP-Address, NAS-IPv6-Address (only block
      from this specific NAS, helpful in roaming scenarios)
    * Operator-Name
  - bad
    * Proxy-State (added by proxies that might not understand it.
      MUST NOT be used)
    * Any other proxy-specific attribute
    * Only User-Name (may be too broad in case of anonymous
      identities)
6. IANA Considerations
This document will have IANA actions.
They are still TODO in detail.
Roughly the following things should be allocated:
* Attribute Type (possibly from extended attributes) for Proxy-
Capability of type string (Extended-Attribute-1, TBD1)
* New registry table for types in the Proxy-Capability attribute
- 0 - reserved
- 1 - Response-Delay-Capable
- 2 - Request-Block-Capable
- 3-125 - reserved for future use
- 126 , 127 - experimental
- 128 - 255 - Extended Capability
* Attribute Type for Response-Delay of type integer (Extended-
Attribute-1, TBD2)
* Attribute Type for Request-Block of type tlv (Extended-Attribute-
1, TBD3)
* New registry table for sub-attribute types in the Request-Block
attribute
- 0 - reserved
- 1 - Request-Block-Period, Type integer, request to stop sending
data for this particular user for period of time, time in
seconds
- 2 - Request-Block-Attributes, type string
- 3 - Request-Block-Extended-Attribute, type string
- 4-250 - reserved for future use
- 251 - private use
- 252-255 - experimental
* New entry in the registry for Values for RADIUS Attribute 101,
Error-Cause Attribute
- 4XX (TBD4) with description "Request ratelimited"
7. References
7.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC2865] Rigney, C., Willens, S., Rubens, A., and W. Simpson,
"Remote Authentication Dial In User Service (RADIUS)",
RFC 2865, DOI 10.17487/RFC2865, June 2000,
<https://www.rfc-editor.org/rfc/rfc2865>.
[RFC8044] DeKok, A., "Data Types in RADIUS", RFC 8044,
DOI 10.17487/RFC8044, January 2017,
<https://www.rfc-editor.org/rfc/rfc8044>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
7.2. Informative References
[RFC6929] DeKok, A. and A. Lior, "Remote Authentication Dial In User
Service (RADIUS) Protocol Extensions", RFC 6929,
DOI 10.17487/RFC6929, April 2013,
<https://www.rfc-editor.org/rfc/rfc6929>.
Appendix A. Document Status
Note to RFC Editor: Remove this section before publication
A.1. Change History
draft-janfred-radext-radius-congestion-control-01:
- no significant changes
draft-janfred-radext-radius-congestion-control-00:
- Initial draft version
Acknowledgments
TODO acknowledge.
Author's Address
Jan-Frederik Rieckers
Deutsches Forschungsnetz | German National Research and Education Network
Alexanderplatz 1
10178 Berlin
Germany
Email: rieckers@dfn.de
URI: www.dfn.de