Network Working Group M. Nottingham
Internet-Draft 18 January 2026
Intended status: Standards Track
Expires: 22 July 2026
Use Cases for Authentication of Web Bots
draft-nottingham-webbotauth-use-cases-01
Abstract
This draft outlines use cases for authentication for bot clients on
the Web.
About This Document
This note is to be removed before publishing as an RFC.
Status information for this document may be found at
https://datatracker.ietf.org/doc/draft-nottingham-webbotauth-use-cases/.
Further information can be found at https://mnot.github.io/I-D/.
Source for this draft and an issue tracker can be found at
https://github.com/mnot/I-D/labels/webbotauth-use-cases.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 22 July 2026.
Copyright Notice
Copyright (c) 2026 IETF Trust and the persons identified as the
document authors. All rights reserved.
Nottingham Expires 22 July 2026 [Page 1]
Internet-Draft webbotauth usecases January 2026
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Web Site Use Cases . . . . . . . . . . . . . . . . . . . . . 2
2.1. Mitigating Volumetric Abuse by Bots . . . . . . . . . . . 3
2.2. Controlling Access by Bots . . . . . . . . . . . . . . . 4
2.3. Providing Different Content to Bots . . . . . . . . . . . 5
2.4. Auditing Bot Behaviour . . . . . . . . . . . . . . . . . 5
2.5. Classifying Traffic . . . . . . . . . . . . . . . . . . . 6
2.6. Authenticating Site Services . . . . . . . . . . . . . . 6
3. Next Steps . . . . . . . . . . . . . . . . . . . . . . . . . 7
4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
5. Security Considerations . . . . . . . . . . . . . . . . . . . 8
Appendix A. Bot Differences . . . . . . . . . . . . . . . . . . 8
A.1. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 8
A.2. Relationship . . . . . . . . . . . . . . . . . . . . . . 8
A.3. Reputation . . . . . . . . . . . . . . . . . . . . . . . 9
A.4. Agency . . . . . . . . . . . . . . . . . . . . . . . . . 9
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 9
1. Introduction
The Web Bot Auth (WebBotAuth) Working Group has been chartered to
"standardize methods for cryptographically authenticating automated
clients and providing additional information about their operators to
Web sites."
Initial discussions in the group have revealed some disagreement
about the scope and intent of the work. Section 2 explores the use
cases for authentication of non-browser clients, to help inform those
discussions. Section 3 suggests some further questions for
consideration.
2. Web Site Use Cases
This section explores use cases that Web sites might have for
authenticating bots, including a discussion of any current mechanisms
that they use to meet the use case.
Because there is some question about the "additional information"
facility in the charter, each use case also assesses whether it's
necessary to identify a real-world entity associated with the bot to
meet the use case (since that is the most common use of such a
facility).
Each use case also summarises how controversial addressing it is
perceived to be.
This draft does not take a position on whether all of the use cases
should be addressed by the group. Potential alternative solutions to
the implied requirements are also not considered here.
2.1. Mitigating Volumetric Abuse by Bots
Some bots make requests at rates that cause operational issues for
Web sites. This may be intentional (e.g., traffic from "botnets" and
other attacks) or unintentional (due to overly simple or
inconsiderate implementation). It appears that both the number of
such bots and the rate at which they make requests are increasing.
While sites can take measures to mitigate the impact of this traffic
(e.g., caching), these are only partially effective; some resources
are uncacheable, and generating representations of some HTTP
resources can incur much higher costs. In general, serving such
great volumes of traffic can consume significant resources, in terms
of both infrastructure and bandwidth.
Currently, a site that experiences such traffic most often blocks
unwelcome clients by IP address. This has the effect of blocking
other uses of that IP address, both at that time and into the
indefinite future. It also offers little recourse for incorrectly
blocked clients, since they have no information about why they were
blocked or what they should do about it.
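For illustration, present-day blocking by network prefix can be
sketched with the Python standard library. The prefix and addresses
below are documentation ranges (RFC 5737), not real clients; the
sketch shows how every client behind a blocked prefix is caught
along with the abusive one:

```python
import ipaddress

# Sketch of present-day blocking by network prefix. The prefix and
# addresses are documentation ranges (RFC 5737), for illustration only.
blocked = ipaddress.ip_network("203.0.113.0/24")

def is_blocked(addr: str) -> bool:
    """Return True if the client address falls within a blocked prefix."""
    return ipaddress.ip_address(addr) in blocked

# The abusive client is blocked...
print(is_blocked("203.0.113.7"))    # True
# ...but so is an unrelated client on the same prefix.
print(is_blocked("203.0.113.99"))   # True
print(is_blocked("198.51.100.1"))   # False
```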
This use case does not require identifying a specific bot or
associating it with a real-world entity, provided that the site
considers abusiveness a feature of behaviour, not identity. It also
does not require discriminating between bots and non-bot users; only
the problematic behaviour is targeted.
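A purely behavioural mitigation of this kind can be sketched as a
token-bucket rate limiter keyed by whatever stable client identifier
is available; the capacity and refill values below are invented for
illustration:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter; parameters are illustrative."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill according to elapsed time, then spend one token if possible.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per client key; the key could be any stable identifier.
buckets: dict[str, TokenBucket] = {}

def check(client_key: str) -> bool:
    bucket = buckets.setdefault(client_key, TokenBucket(5, 1.0))
    return bucket.allow()
```

Keying the bucket by an authenticated identifier rather than an IP
address would confine the mitigation to the misbehaving client.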
Addressing this use case does not appear to be overly controversial,
because it is widely recognised that a site needs to operate with
reasonable efficiency to provide both its operators and its users a
benefit.
2.2. Controlling Access by Bots
Some sites wish to make access by bots to the resources they provide
to browsers conditional upon the identity or features of the bot.
This might be for a variety of reasons; they may wish to:
* Only allow access by bots on an allow list;
* Disallow access to bots on an explicit deny list;
* Condition access upon meeting some criteria (e.g., non-profit,
certification by a third party);
* Condition access upon participation in some scheme or protocol
(e.g., payment for access).
Note that the first two imply some notion of bots being tied to a
real-world identity, whereas the remaining two do not necessarily
require it.
Currently, sites most often use a combination of the Robots Exclusion
Protocol (including robots.txt) and IP address blocking to control
access by bots.
The Robots Exclusion Protocol provides a means for sites to
communicate preferences to bots about their behaviour. Although this
is a useful and sometimes necessary function, it does not allow for
enforcement of those preferences.
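The preference-communication side of this arrangement can be
exercised with the robots.txt parser in the Python standard library;
the rules and the bot name "examplebot" below are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt body; "examplebot" is not a real crawler.
ROBOTS_TXT = """\
User-agent: examplebot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())
parser.modified()  # mark the rules as freshly fetched

print(parser.can_fetch("examplebot", "/private/data"))  # False
print(parser.can_fetch("examplebot", "/public/page"))   # True
```

Note that nothing compels the bot to run this check; compliance is
entirely voluntary, which is the enforcement gap described above.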
Enforcement is achieved primarily through blocking non-conforming
clients. The limitations of IP address blocking are discussed in
Section 2.1.
This use case has been disputed. While blocking certain bots by IP
address is widespread in practice, concerns have been expressed that
standardising an authentication mechanism for bots might result in a
Web where all bots might need to authenticate, leading to increased
difficulty in introducing new bots. In some markets, this outcome
could create pressure towards centralisation, due to heightened
barriers to entry.
Another controversy is that giving sites a more fine-grained
capability to block bots is a change in the balance of power on the
Web. Some perceive that as justified, given factors like the
introduction of AI and what they perceive as an onslaught of bot
traffic. Others see it as an overreach that may impinge upon users'
ability to consume content as they desire -- for example, using
accessibility or agentic tools.
Finally, some see bots as a way of keeping powerful sites in check,
and measures to curtail their activity are therefore portrayed as
concentrating that power. However, it should be noted that there are
also powerful bots that can be seen to have disproportionate power
over sites, and so there is not necessarily a clear bias here.
2.3. Providing Different Content to Bots
Some sites may wish to tailor the content they serve to bots (either
selectively or overall), as compared to what they serve to browsers.
In some cases, a site might wish to augment the information that they
provide to a trusted bot. Conversely, a site might wish to reduce or
modify the information that they provide to a bot that they do not
trust.
Current practice is difficult to ascertain, but anecdotal evidence
suggests that the latter case is more common than the former. For
example, some sites do not wish for information that they consider to
be commercially sensitive -- e.g., prices -- to be available to bots.
In both cases, IP addresses and similar heuristics are used.
In most cases, this use requires identifying a specific bot and
associating it with a real-world entity (although there are
exceptions, such as sites which want to treat all bots equally, or
cases where it's possible to group bots without identifying specific
ones).
This use case is likely to be controversial in cases where the
modifications are not consensual. Some espouse a site's right to
control its own speech depending upon the audience it is speaking to,
whereas others are concerned by the lack of transparency that might
result -- particularly from powerful sites. Note, however, that a
bot that cannot be distinguished from a typical browser is still
likely to be able to operate for such purposes.
2.4. Auditing Bot Behaviour
Some sites may wish to understand how bots use them in detail. In
particular, they might want to verify that a bot adheres to the
preferences stated in robots.txt, or that they conform to some other
protocol. They might also wish to have reliable metrics for how a
bot behaves (the number of requests, their timing, and so forth);
this information might feed into other use cases, or be used
independently.
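As a sketch, such metrics are straightforward to derive once each
request can be attributed to a verified identity; the log records
and identities below are invented for illustration:

```python
from collections import Counter

# Hypothetical audit records from an access log: (timestamp in
# seconds, verified bot identity).
log = [
    (1000.0, "bot-a"),
    (1000.5, "bot-a"),
    (1001.0, "bot-b"),
    (1002.0, "bot-a"),
]

# Requests per identity.
counts = Counter(identity for _, identity in log)

# Mean request rate per identity over the observed window.
window = max(t for t, _ in log) - min(t for t, _ in log)
rates = {ident: n / window for ident, n in counts.items()}

print(counts["bot-a"])  # 3 requests
print(rates["bot-a"])   # 1.5 requests/second over the 2-second window
```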
Currently, this use case is met through heuristics applied to
information such as the IP address. It does not necessarily require
identifying a specific bot or associating it with a real-world
entity, but some (many?) of the downstream uses of the audit data
may.
This use case does not appear controversial, because bots being
accountable for their behaviour is broadly seen as a reasonable goal.
2.5. Classifying Traffic
Many sites make efforts to understand how browsers interact with
them, so as to improve their services. This might be at the
connection level (e.g., HTTP, TCP, and QUIC statistics), or it might
be gathered in-browser (Real User Monitoring or RUM).
When doing so, it is important for them to be able to distinguish
between their target audience (people using browsers) and bots; if
they cannot, the bot traffic will make the insights they gain less
useful (or even useless).
Currently, sites that perform such tasks use a variety of heuristics
to identify and exclude bots from such measures. This is only
partially effective; bots are increasingly difficult to classify,
particularly as using 'headless browsers' becomes a norm for
crawlers.
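A minimal sketch of such a heuristic, assuming classification by
User-Agent substring (the token list is illustrative and
deliberately incomplete):

```python
# A naive User-Agent heuristic of the kind sites rely on today; the
# token list is invented and far from exhaustive.
BOT_TOKENS = ("bot", "crawler", "spider", "headless")

def looks_like_bot(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)

# A headless browser that advertises itself is caught...
print(looks_like_bot("Mozilla/5.0 HeadlessChrome/120.0"))            # True
# ...but one that copies a stock browser string is not.
print(looks_like_bot("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"))  # False
```

A bot that copies a stock browser string passes unnoticed, which is
precisely the limitation described above.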
This use case does not require identifying specific bots or
associating them with real-world entities unless finer granularity of
classification than "bot vs not" is desired. However, sites that
wish to exclude non-human clients from their measurements would still
need to use heuristics for bots that do not comply with the norm.
Addressing this use case does not appear to be controversial, because
an understanding of the nature of traffic that a site receives is
important to its operation (provided that no personal information is
involved and no tracking capability is introduced).
2.6. Authenticating Site Services
Many sites use third-party tools to analyse, monitor, and provide
their services. For example, health check services allow sites to
understand their uptimes and receive notifications when there is a
problem. Content Delivery Networks need to identify themselves to
back-end origin servers.
Currently, such services use a variety of means of authentication,
including IP address allow lists, "magic" header fields, and ad hoc
use of other existing mechanisms.
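One common form of the "magic" header approach can be sketched as a
shared-secret HMAC over the request body; the header name, secret,
and messages below are invented for illustration:

```python
import hashlib
import hmac

# Sketch of the "magic header field" pattern: the site service and
# the origin share a secret out of band. The secret and messages
# are invented for illustration.
SHARED_SECRET = b"example-shared-secret"

def sign(body: bytes) -> str:
    """Value the service would place in, e.g., a hypothetical
    X-Service-Auth header."""
    return hmac.new(SHARED_SECRET, body, hashlib.sha256).hexdigest()

def verify(body: bytes, header_value: str) -> bool:
    # compare_digest avoids leaking the comparison result via timing.
    return hmac.compare_digest(sign(body), header_value)

tag = sign(b"health-check ping")
print(verify(b"health-check ping", tag))  # True
print(verify(b"tampered body", tag))      # False
```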
Site services often have higher requirements for reliability and
security. A site might not wish to grant access to a vulnerability
scanner solely based upon its IP address, for example. Likewise, a
health check needs to reliably bypass Web Application Firewalls to
perform its function.
This use case requires bot identity to be tied to authentication.
Addressing this use case does not appear to be controversial.
However, it is not clear whether these use cases are within the scope
of the Working Group's charter.
3. Next Steps
This section suggests questions for further investigation and
discussion.
1. What are the qualitative differences between current practice
(e.g. ad hoc blocking by IP address) and proposals for
authentication of bots?
2. User authentication is widespread and standards-supported on the
Web; what makes bot authentication different?
3. What levers do we have to mitigate the harms associated with an
emerging default of requiring authentication for bots? Does
authentication enhance or confound such efforts (as opposed to IP
address blocking)?
4. Would an authentication scheme that does not allow association
with real-world entities provide enough value to meet interesting
use cases? If so, would the charter prohibition on "[t]racking
or assigning reputation to particular bots" need to change?
5. What is the threshold for being considered a bot? E.g., is
request rate important? Associating with a specific human user
in time and/or space?
6. Are the resource requirements for authentication proposals
reasonable for these use cases for all types of sites? At the
meeting, it was asserted that it would disproportionately
advantage already well-resourced entities.
7. What use cases should the group address and not address? Why?
8. Are there alternative approaches to addressing some or all of
these use cases? What properties do they have?
4. IANA Considerations
This draft has no actions for IANA.
5. Security Considerations
Undoubtedly there are security considerations for any authentication
protocol, but they will be encountered and dealt with in the
documents that define such protocols; they are beyond the scope of
this draft.
Appendix A. Bot Differences
This section enumerates some of the ways that bots can differ.
A.1. Scope
Bots have different scopes of activity:
* Some crawl the entire Web
* Some target a specific subset of the Web (e.g., by geography,
language, industry)
* Some target specific sites or resources on sites (e.g., link
checkers, linters)
A.2. Relationship
Bots have different relationships with sites:
* Some actively attempt to appear as Web browsers, so as to have the
same relationship as an end user
* Some do not hide their nature as bots but do not have any pre-
existing relationship with the site
* Some are implicitly or explicitly authorised by the site (e.g.,
through an advertised API)
* Some have a pre-existing relationship with the site (e.g.,
monitoring and other site services)
A.3. Reputation
Bots have different reputations in the larger community, which can
change how they are perceived by sites:
* Some are well and widely known (e.g., search engine crawlers,
archivers)
* Some are relatively unknown (e.g., due to low traffic or recent
introduction)
* Some are purposefully anonymous (e.g., price checkers, most
malicious bots)
A.4. Agency
Bots act with different relationships to their operator(s):
* Some are explicitly and exclusively associated with an end user
(e.g., "agentic" bots)
* Some are acting on behalf of a group of end users
* Some are acting on behalf of another entity (e.g., corporation,
government, civil society organisation)
* Some serve multiple constituencies
Author's Address
Mark Nottingham
Melbourne
Australia
Email: mnot@mnot.net
URI: https://www.mnot.net/