A Vocabulary For Expressing AI Usage Preferences
draft-ietf-aipref-vocab-05

Document Type Active Internet-Draft (aipref WG)
Authors Paul Keller , Martin Thomson
Last updated 2025-12-01
Replaces draft-keller-aipref-vocab
RFC stream Internet Engineering Task Force (IETF)
Intended RFC status Proposed Standard
Stream WG state WG Document
Associated WG milestone
Aug 2026
A standard track specification describing vocabulary for expressing AI-related preferences to the IESG for publication
Document shepherd (None)
IESG IESG state I-D Exists
Consensus boilerplate Yes
AI Preferences                                                 P. Keller
Internet-Draft                                               Open Future
Intended status: Standards Track                         M. Thomson, Ed.
Expires: 4 June 2026                                             Mozilla
                                                         1 December 2025

            A Vocabulary For Expressing AI Usage Preferences
                       draft-ietf-aipref-vocab-05

Abstract

   This document defines a vocabulary for expressing preferences
   regarding how digital assets are used by automated processing
   systems.  This vocabulary allows for the declaration of restrictions
   or permissions for use of digital assets by such systems.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://ietf-wg-aipref.github.io/drafts/draft-ietf-aipref-vocab.html.
   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/.

   Discussion of this document takes place on the AI Preferences Working
   Group mailing list (mailto:ai-control@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/ai-control/.  Subscribe at
   https://www.ietf.org/mailman/listinfo/ai-control/.

   Source for this draft and an issue tracker can be found at
   https://github.com/ietf-wg-aipref/drafts.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Keller & Thomson           Expires 4 June 2026                  [Page 1]
Internet-Draft          AI Preference Vocabulary           December 2025

   This Internet-Draft will expire on 4 June 2026.

Copyright Notice

   Copyright (c) 2025 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Statements of Preference
     3.1.  Conformance
     3.2.  Applicability and Effect
   4.  Vocabulary Definition
     4.1.  Foundation Model Production Category
     4.2.  Search
     4.3.  Vocabulary Extensions
   5.  Applying Statements of Preference
     5.1.  Combining Preferences
     5.2.  More Specific Instructions
   6.  Exemplary Serialization Format
     6.1.  Usage Category Labels
     6.2.  Preference Labels
     6.3.  Text Encoding
     6.4.  Syntax Extensions
     6.5.  Processing Algorithm
     6.6.  Alternative Formats
   7.  Security Considerations
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Acknowledgments
   Authors' Addresses

1.  Introduction

   This document defines a vocabulary of preferences regarding how
   automated systems process digital assets -- in particular, the
   training and use of AI models.  This vocabulary can be used to
   describe the types of uses that a declaring party may wish to
   explicitly restrict or allow.

   The vocabulary is intended to be used in jurisdictions where
   expressing preferences results in legal obligations, as well as where
   there are no associated legal obligations.  In either case,
   expressing preferences is without prejudice to applicable laws,
   including the applicability of exceptions and limitations to
   copyright.

   Section 3 defines the data model for AI Preferences.  Section 4
   defines the terms of the vocabulary.  Section 5 explains how to use
   AI Preferences in a data processing application, including a process
   for determining the preference for a category of use.  Section 6
   describes a way to serialize preferences into a string.

   [ATTACH] defines mechanisms to associate preferences with assets.
   Other means of association might be defined separately in the future.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   This document uses the following terms:

   Artificial Intelligence (AI):
      An engineered system of sufficient complexity that, for a given
      set of human-defined objectives, learns from data to generate
      outputs such as content, predictions, recommendations, or
      decisions.
   AI Training:
      The application of machine learning to data to produce or improve
      a model for an artificial intelligence system.
   Asset:
      A digital file or stream of data, usually with associated
      metadata.
   Declaring party:
      The entity that expresses a preference with regard to an Asset.

   Machine Learning (ML):
      The processing of data to produce or improve a model that encodes
      the relationship between the data and human-defined objectives.
   Search Application:
      A system that enables users to locate items on the Internet or in
      a specific data store.

3.  Statements of Preference

   The vocabulary is a set of categories, each of which is defined to
   cover a class of usage for assets.  Section 4 defines the core set of
   usage categories in detail.

   A statement of preference -- or usage preference -- is made about an
   asset.  A statement of preference follows a simple data model where a
   preference is assigned to each of the categories of use in the
   vocabulary.  A preference is either to allow or disallow the usage
   associated with the category.

   A statement of preference can indicate preferences about some, all,
   or none of the categories from the vocabulary.  This can mean that no
   preference is stated for a given usage category.

   Some categories describe a proper subset of the usages of other
   categories.  A preference that is stated for the more general
   category applies if no preference is stated for the more specific
   category.

   For example, a more general category might be assigned a preference
   that allows the associated usage.  In the absence of any statement of
   preference regarding categories that are more specific subsets of
   that usage category, usage within those categories would also be
   allowed.  An explicit preference regarding the more specific usage
   category can be used to disallow the more specific usage, while
   indicating that other usage within the more general category is
   permissible.

   After processing a statement of preference, the recipient associates
   each category of use with one of three preference values: "allowed",
   "disallowed", or "unknown".  In the absence of a statement of
   preference, all usage categories are assigned a preference value of
   "unknown".

   The process for consulting a statement of preference is defined in
   Section 5.
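
   As an illustration only, the data model described in this section
   could be held in memory as a mapping from usage category to
   preference value.  The following Python sketch makes several
   assumptions: the names Preference, CATEGORIES, empty_statement, and
   with_explicit are hypothetical and are not defined by this document,
   and the category labels are borrowed from Table 1.

```python
from enum import Enum
from typing import Dict


class Preference(Enum):
    """The three values a recipient can associate with a category."""
    ALLOWED = "allowed"
    DISALLOWED = "disallowed"
    UNKNOWN = "unknown"


# Labels borrowed from Table 1; the tuple itself is an assumption of
# this sketch, not part of the vocabulary.
CATEGORIES = ("train-ai", "search")


def empty_statement() -> Dict[str, Preference]:
    """With no statement of preference, every category is unknown."""
    return {category: Preference.UNKNOWN for category in CATEGORIES}


def with_explicit(explicit: Dict[str, Preference]) -> Dict[str, Preference]:
    """Overlay explicit preferences on the all-unknown starting point."""
    model = empty_statement()
    for category, preference in explicit.items():
        if category in model:  # categories outside the vocabulary are skipped
            model[category] = preference
    return model
```

   This sketch records only explicit preferences.  Falling back from a
   more specific category to a more general one is a separate step.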

   Different declaring parties might each make their own statement of
   preference regarding a particular asset.  The process for managing
   multiple statements of preference is defined in Section 5.1.

   An exemplary syntax for statements of preference is defined in
   Section 6.

3.1.  Conformance

   This document and [ATTACH] describe how statements of preference are
   associated with assets.  An implementation is conformant to these
   specifications if it correctly follows all normative requirements
   that apply to it.

   The process of obtaining a statement of preference has very limited
   scope for variation between implementations.

3.2.  Applicability and Effect

   This specification provides a set of definitions for different
   categories of use, plus a system for associating simple preferences
   with each (allow, disallow, or no preference; see Section 3).

   This specification does not provide any enforcement mechanism for
   those preferences, and conformance to it does not encompass whether
   preferences are actually respected during data processing.

   Preferences do not themselves create rights or prohibitions, either
   in the positive or the negative.  Other mechanisms -- technical,
   legal, contractual, or otherwise -- might enforce stated preferences
   and thereby determine the consequences of following or not following
   a stated preference.

   An entity that receives usage preferences MAY choose to respect those
   preferences it has discovered, according to an understanding of how
   the asset is used, how that usage corresponds to the usage categories
   where preferences have been stated, and the applicable legal context.

   Usage preferences can be ignored due to express agreements between
   relevant parties, explicit provisions of law, or the exercise of
   discretion in situations where widely recognized priorities justify
   doing so.  Priorities that could justify ignoring preferences
   include -- but are not limited to -- free expression, safety,
   education, scholarship, research, preservation, interoperability,
   and accessibility.

   The following are examples of cases where other priorities could
   lead someone to ignore expressed preferences in a particular
   situation:

   *  People with accessibility needs, or organizations working on their
      behalf, might decide to ignore a preference in order to access
      automated captions or generate accessible formats.

   *  A cultural heritage organization might decide to ignore a
      preference in order to provide more useful, reliable, or
      discoverable access to historical web collections.

   *  An educational institution might decide to ignore a preference in
      order to enable scholars to develop or use tools to facilitate
      scientific or other types of research.

   *  A website that permits user uploads might decide to ignore a
      preference in order to develop or use tools that detect harmful
      content according to established terms of use.

   Because enforcement is not provided by this specification, the
   consequences of ignoring preferences could vary depending upon how a
   given legal jurisdiction recognizes preferences.

4.  Vocabulary Definition

   This section defines the categories of use in the vocabulary.

4.1.  Foundation Model Production Category

   The act of using an asset to train or fine-tune a foundation model.

   Foundation models are large models that are produced using deep
   learning or other machine learning techniques.  Foundation models are
   trained on very large numbers of assets so that they can be applied
   to a wide range of use cases.  Foundation models typically possess
   generative capabilities in one or more media.

   Fine-tuning can specialize a general-purpose foundation model for a
   narrower set of use cases.

4.2.  Search

   Using one or more assets in a search application that directs users
   to the location from which the assets were retrieved.

   The presentation of any asset that is included in search output is
   subject to the following conditions:

   *  A reference to the location from which the asset was obtained is
      presented as part of the output.

   *  The asset can only be represented in the output with excerpts that
      are drawn verbatim from it.

   An asset can be used in ranking without being presented in output.

   Internal processing of assets to perform ranking and presentation can
   include the use and training of AI models.  This includes only the
   training that is necessary to produce models used in the search
   application.

   With both these conditions, a preference to allow Search usage
   enables the presentation of links and titles in what is considered
   "traditional" search results.

4.3.  Vocabulary Extensions

   Extensions to this vocabulary need to be defined in an RFC that
   updates this document.

   Any future extensions to this vocabulary MUST NOT introduce
   additional categories that include existing categories defined in the
   vocabulary.  That is, new categories of use can be defined as a
   subset of an existing category, but not a superset.

   Systems that use this vocabulary might define their own extensions as
   part of a larger data model.  Section 6.6 describes how concepts from
   an alternative format might be mapped to this vocabulary.

5.  Applying Statements of Preference

   After acquiring a statement of preference, which might use the
   process in Section 6.5, an application can determine the status of a
   specific usage category as follows:

   1.  If the statement of preference contains an explicit preference
       regarding that category of use -- either to allow or disallow --
       that is the result.

   2.  Otherwise, if the usage category is a proper subset of another
       usage category, recursively apply this process to that category
       and use the result of that process.

   3.  Otherwise, no preference is stated.

   This process results in one of three potential answers: allow,
   disallow, or unknown.  Applications can use the answer to guide
   their behavior.

   One approach for dealing with an "unknown" outcome is to assign a
   default value.  This document takes no position on what default might
   be assigned.
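
   The three steps above can be sketched in Python as follows.  This is
   a minimal illustration, not a normative implementation.  The PARENT
   table, the "example-sub" category, and the function names are all
   hypothetical; the sketch assumes that the subset relationships
   between categories are known to the application in advance.

```python
from typing import Dict

# Hypothetical subset relationships: specific category -> general category.
# "example-sub" is invented for illustration and is not in the vocabulary.
PARENT: Dict[str, str] = {"example-sub": "train-ai"}


def resolve(category: str, explicit: Dict[str, str]) -> str:
    """Return "allow", "disallow", or "unknown" for one usage category.

    `explicit` holds only the preferences that were explicitly stated,
    keyed by category, with values "allow" or "disallow".
    """
    # Step 1: an explicit preference for this category is the result.
    if category in explicit:
        return explicit[category]
    # Step 2: otherwise, defer to the more general category, if any.
    parent = PARENT.get(category)
    if parent is not None:
        return resolve(parent, explicit)
    # Step 3: otherwise, no preference is stated.
    return "unknown"


def resolve_with_default(category: str, explicit: Dict[str, str],
                         default: str) -> str:
    """Replace an "unknown" outcome with a caller-chosen default."""
    result = resolve(category, explicit)
    return default if result == "unknown" else result
```

   The default passed to resolve_with_default is chosen entirely by the
   caller, in keeping with this document taking no position on it.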

5.1.  Combining Preferences

   The application might have multiple statements of preference,
   obtained using different methods or from different declaring parties.
   This might result in conflicting answers.

   Absent some other means of resolving conflicts, the following process
   applies to each usage category:

   *  If any statement of preference indicates that the usage is
      disallowed, the result is that the usage is disallowed.

   *  Otherwise, if any statement of preference allows the usage, the
      result is that the usage is allowed.

   *  Otherwise, no preference is stated.

   This process ensures that the most restrictive preference applies.
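
   A possible rendering of this combination rule in Python is shown
   below.  The function name combine is hypothetical, and the sketch
   assumes that each statement of preference has already been reduced
   to one answer ("allow", "disallow", or "unknown") for the usage
   category in question.

```python
from typing import Iterable


def combine(answers: Iterable[str]) -> str:
    """Combine per-statement answers for a single usage category."""
    collected = list(answers)
    # Any disallow wins, so the most restrictive preference applies.
    if "disallow" in collected:
        return "disallow"
    # Otherwise, any allow is enough to allow the usage.
    if "allow" in collected:
        return "allow"
    # Otherwise, no preference is stated.
    return "unknown"
```

   Under this sketch, combine(["allow", "unknown", "disallow"]) yields
   "disallow", and an empty list of statements yields "unknown".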

5.2.  More Specific Instructions

   A recipient of a statement of preference that follows the model in
   Section 3 might receive more specific instructions in two ways:

   *  Extensions to the vocabulary might define more specific categories
      of usage.  Preferences about more specific categories override
      those of any more general category.

   *  Contractual agreements or other specific arrangements might
      override statements of preference.

   For instance, a statement of preference might indicate a preference
   to disallow a category of use for an asset.  If arrangements, such as
   legal agreements, exist that explicitly permit the use of that asset,
   those arrangements likely apply despite the existence of machine-
   readable statements of preference, unless the terms of the
   arrangement explicitly say otherwise.

6.  Exemplary Serialization Format

   This section defines an exemplary serialization format for
   preferences.  The format describes how the abstract model could be
   turned into Unicode text or a sequence of bytes.

   The format relies on the Dictionary type defined in Section 3.2 of
   [FIELDS].  The dictionary keys correspond to usage categories and the
   dictionary values correspond to explicit preferences, which can be
   either y or n; see Section 6.2.

   For example, the following states a preference to allow foundation
   model production (Section 4.1) and to disallow search (Section 4.2).
   It states no preference for any other category, apart from subsets
   of these categories:

   train-ai=y, search=n
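
   A string like the one above could be produced from explicit
   preferences with a few lines of Python.  This is a sketch under
   stated assumptions: the function name serialize is hypothetical,
   preferences are passed as booleans (True for allow, False for
   disallow), and labels are assumed to be valid Dictionary keys
   already, so no validation or escaping is attempted.

```python
from typing import Dict


def serialize(explicit: Dict[str, bool]) -> str:
    """Render explicit preferences as a Dictionary-style string.

    Categories with no stated preference are simply left out of
    `explicit` and therefore out of the output.
    """
    members = []
    for label, allowed in explicit.items():
        # y and n are the preference labels for allow and disallow.
        members.append(f"{label}={'y' if allowed else 'n'}")
    return ", ".join(members)
```

   For instance, serialize({"train-ai": True, "search": False})
   returns the string shown in the example above.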

6.1.  Usage Category Labels

   Each usage category in the vocabulary (Section 4) is mapped to a
   short textual label.  Table 1 tabulates this mapping.

         +=============================+==========+=============+
         | Category                    | Label    | Reference   |
         +=============================+==========+=============+
         | Foundation Model Production | train-ai | Section 4.1 |
         +-----------------------------+----------+-------------+
         | Search                      | search   | Section 4.2 |
         +-----------------------------+----------+-------------+

                     Table 1: Mappings for Categories

   These tokens are case sensitive.

   Tokens defined for a new usage category can only use lowercase Latin
   characters (a-z), digits (0-9), "_", "-", ".", or "*".  These are
   encoded using the mappings in [ASCII].
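
   The character restriction above can be checked mechanically.  The
   Python sketch below is illustrative only; the function name
   is_valid_label is hypothetical.  It also applies the rule from the
   key syntax in [FIELDS] that a Dictionary key begins with a lowercase
   letter or "*", which is an addition of this sketch rather than a
   requirement stated in this section.

```python
import re

# First character: a-z or "*" (the key syntax in [FIELDS]).
# Remaining characters: a-z, 0-9, "_", "-", ".", or "*".
_LABEL = re.compile(r"[a-z*][a-z0-9_.*-]*")


def is_valid_label(label: str) -> bool:
    """Report whether `label` uses only the permitted characters."""
    return _LABEL.fullmatch(label) is not None
```

   With this check, "train-ai" and "search" are accepted, while
   "Train-AI" is rejected because labels are case sensitive.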

6.2.  Preference Labels

   The data model in Section 3 has two options for preferences
   associated with each category: allow and disallow.  These are mapped
   to single-byte Tokens (Section 3.3.4 of [FIELDS]) of y and n,
   respectively.

6.3.  Text Encoding

   Structured Fields [FIELDS] describes a byte-level encoding of
   information, not a text encoding.  This makes this format suitable
   for inclusion in any protocol or format that carries bytes.

   Some formats are defined in terms of strings rather than bytes.
   These formats might need to decode the bytes of this format to obtain
   a string.  As the syntax is limited to ASCII [ASCII], an ASCII
   decoder or UTF-8 decoder [UTF8] can be used.  This results in the
   strings that this document uses.

   Processing (see Section 6.5) requires a sequence of bytes, so any
   format that uses strings needs to encode strings first.  Again, this
   process can use ASCII or UTF-8.
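
   The conversion in both directions is small enough to show in full.
   The following Python sketch uses hypothetical helper names (to_text
   and to_bytes) and relies on the observation above that, for
   ASCII-only content, ASCII and UTF-8 produce identical bytes.

```python
def to_text(data: bytes) -> str:
    """Decode the byte form for use in a string-based format.

    Raises UnicodeDecodeError if `data` contains non-ASCII bytes.
    """
    return data.decode("ascii")


def to_bytes(text: str) -> bytes:
    """Encode a string back into bytes before processing."""
    return text.encode("utf-8")
```

   A round trip such as to_bytes(to_text(b"train-ai=y, search=n"))
   returns the original bytes unchanged.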

6.4.  Syntax Extensions

   There are two ways by which this syntax might be extended: the
   addition of new labels and the addition of parameters.

   New labels might be defined to correspond to new usage categories.
   Section 4.3 addresses the considerations for defining new categories.
   New labels might also be defined for other types of extension that do
   not assign a preference to a usage category.  In either case, when
   processing a parsed Dictionary to obtain preferences, any unknown
   labels MUST be ignored.

   The Dictionary syntax (Section 3.2 of [FIELDS]) can associate
   parameters with each key-value pair.  This document does not define
   any semantics for any parameters that might be included.  When
   processing a parsed Dictionary to obtain preferences, any unknown
   parameters MUST be ignored.

   In either case, new extensions need to be defined in an RFC that
   updates this document.

6.5.  Processing Algorithm

   To process a series of bytes to recover the stated preferences, those
   bytes are parsed into a Dictionary (Section 4.2.2 of [FIELDS]), then
   preferences are assigned to each usage category in the vocabulary.

   This algorithm produces a keyed collection of values, where each key
   has at most one value and optional parameters.

   To obtain preferences, iterate through the defined categories in the
   vocabulary.  For each category, take the label that corresponds to
   it (see Table 1) and obtain the corresponding value from the
   collection, disregarding any parameters.  A preference is assigned as
   follows:

   *  If the value is a Token with a value of y, the associated
      preference is to allow that category of use.

   *  If the value is a Token with a value of n, the associated
      preference is to disallow that category of use.

   *  Otherwise, no preference is stated for that category of use.

   Note that this last alternative includes the key being absent from
   the collection, values that are not Tokens, and Token values that are
   other than y or n.  None of these is an error; each only results in
   no preference being inferred.

   It is important to note that if the same key appears multiple times,
   only the last value is taken.  This means that duplicating a key
   could result in unexpected outcomes.  For example, the following
   expresses no preferences:

   train-ai=y, train-ai="n", search=n, search

   If the parsing of the Dictionary fails, no preferences are stated.
   This includes where keys include uppercase characters, as this format
   is case sensitive (more correctly, it operates on bytes, not
   strings).

   This document does not define a use for parameters.  Where parameters
   are used, only those parameters associated with the value that is
   selected according to Section 4.2.2 of [FIELDS] apply.  Parameters
   can therefore be carried for any preference value, including where no
   preference is expressed.

   For example, the following train-ai preference has parameters even
   though no preference is expressed:

   train-ai;has;parameters="?"

   This process produces an abstract data model that assigns a
   preference to each usage category as described in Section 3.
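
   The assignment step can be sketched in Python as follows.  The
   sketch does not parse bytes; it assumes that a parser for [FIELDS]
   has already produced a mapping from each key to a (value,
   parameters) pair, keeping only the last occurrence of a duplicated
   key, and that a failed parse is represented as None.  The Token
   class and the assign_preferences function are hypothetical
   stand-ins, not the API of any particular library.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class Token:
    """Stand-in for the Token type, kept distinct from String."""
    value: str


# Labels from Table 1.
LABELS = ("train-ai", "search")

Parsed = Dict[str, Tuple[Any, Dict[str, Any]]]


def assign_preferences(parsed: Optional[Parsed]) -> Dict[str, str]:
    """Assign allow, disallow, or unknown to each known label."""
    if parsed is None:  # parsing failed: no preferences are stated
        parsed = {}
    result = {}
    for label in LABELS:  # unknown keys in `parsed` are never consulted
        value, _params = parsed.get(label, (None, {}))  # parameters ignored
        if value == Token("y"):
            result[label] = "allow"
        elif value == Token("n"):
            result[label] = "disallow"
        else:
            # Absent key, non-Token value, or a Token other than y or n.
            result[label] = "unknown"
    return result
```

   For the duplicated-key example in this section, the parsed input
   would hold the String "n" for train-ai and the Boolean true for
   search, so both categories come out as "unknown".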

6.6.  Alternative Formats

   This format is only an exemplary way to represent preferences.  The
   data model described in Section 3 can be used without this
   serialization.

   Any alternative format needs to define the mapping both from that
   format to the model used in this document and from the model to the
   alternative format.  This includes any potential for extensions
   (Section 6.4).

   The mapping between the data model and the alternative format does
   not need to be complete; it only needs to be clear and unambiguous.

   For example, an alternative format might only provide the ability to
   convey preferences for a subset of the categories of use.  A mapping
   might then define that no preference is associated with other
   categories.
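
   To make the idea of a partial mapping concrete, the Python sketch
   below invents an alternative format for illustration.  The format,
   a record with a single "training" field set to "permit" or "deny",
   is entirely hypothetical, as are the function names.  It conveys a
   preference only for the train-ai label and maps every other category
   to no preference.

```python
from typing import Dict


def from_alternative(record: Dict[str, str]) -> Dict[str, str]:
    """Map the hypothetical record onto the model in this document."""
    mapping = {"permit": "allow", "deny": "disallow"}
    training = mapping.get(record.get("training", ""), "unknown")
    # The hypothetical format cannot express a search preference.
    return {"train-ai": training, "search": "unknown"}


def to_alternative(model: Dict[str, str]) -> Dict[str, str]:
    """Map the model back; preferences the format cannot carry are lost."""
    reverse = {"allow": "permit", "disallow": "deny"}
    value = reverse.get(model.get("train-ai", "unknown"))
    return {"training": value} if value is not None else {}
```

   The mapping is incomplete, since a search preference cannot be
   carried, but it is unambiguous in both directions.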

7.  Security Considerations

   Preferences are not a security mechanism.  Section 3.2 addresses what
   it means to express a preference.

   Processing a concrete instantiation of the exemplary format described
   in Section 6 is subject to the security considerations in Section 6
   of [FIELDS].

8.  IANA Considerations

   This document has no IANA actions.

9.  References

9.1.  Normative References

   [ASCII]    Cerf, V., "ASCII format for network interchange", STD 80,
              RFC 20, DOI 10.17487/RFC0020, October 1969,
              <https://www.rfc-editor.org/rfc/rfc20>.

   [FIELDS]   Nottingham, M. and P. Kamp, "Structured Field Values for
              HTTP", RFC 9651, DOI 10.17487/RFC9651, September 2024,
              <https://www.rfc-editor.org/rfc/rfc9651>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

9.2.  Informative References

   [ATTACH]   Illyes, G. and M. Thomson, Ed., "A Vocabulary For
              Expressing AI Usage Preferences", Work in Progress,
              Internet-Draft, draft-ietf-aipref-attach-00, 1 December
              2025, <https://datatracker.ietf.org/doc/html/draft-ietf-
              aipref-attach-00>.

   [UTF8]     Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
              2003, <https://www.rfc-editor.org/rfc/rfc3629>.

Acknowledgments

   The following individuals made significant contributions to this
   document:

   *  Lila Bailey
   *  Cullen Miller
   *  Laurent Le Meur
   *  Krishna Madhavan
   *  Felix Reda
   *  Leonard Rosenthol
   *  Sebastian Posth
   *  Erin Simon
   *  Timid Robot Zehta

Authors' Addresses

   Paul Keller
   Open Future
   Email: paul@openfuture.eu

   Martin Thomson (editor)
   Mozilla
   Email: mt@lowentropy.net
