Robots Exclusion Protocol Extension to manage AI content use
draft-canel-robots-ai-control-00

Document Type Active Internet-Draft (individual)
Authors Fabrice Canel , Krishna Madhavan
Last updated 2024-10-21
Internet Engineering Task Force                            F. Canel, Ed.
Internet-Draft                                               K. Madhavan
Updates: 9309 (if approved)                        Microsoft Corporation
Intended status: Informational                           21 October 2024
Expires: 24 April 2025

      Robots Exclusion Protocol Extension to manage AI content use
                    draft-canel-robots-ai-control-00

Abstract

   This document extends RFC9309 by specifying additional rules for
   controlling usage of the content in the field of Artificial
   Intelligence (AI).

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 24 April 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Canel & Madhavan          Expires 24 April 2025                 [Page 1]
Internet-Draft  Robots Exclusion Protocol Extension to m    October 2024

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Requirements Language . . . . . . . . . . . . . . . . . . . .   2
   3.  Specification . . . . . . . . . . . . . . . . . . . . . . . .   2
     3.1.  Robots Control Rules  . . . . . . . . . . . . . . . . . .   2
     3.2.  Application Layer Response Header . . . . . . . . . . . .   3
     3.3.  HTML Meta Element . . . . . . . . . . . . . . . . . . . .   3
   4.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   3

1.  Introduction

   While the Robots Exclusion Protocol enables service owners to control
   how, if at all, automated clients known as crawlers may access the
   URIs on their services as defined by [RFC8288], the protocol does not
   provide controls on how the data returned by those services may be
   used in training generative AI foundation models.

   Application developers are requested to honor the rules defined in
   this document.  The rules are not a form of access authorization,
   however.

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Specification

3.1.  Robots Control Rules

   The following rules complement the existing Allow and Disallow
   rules:

      DisallowAITraining - instructs the parser not to use the data for
      training AI language models.

      AllowAITraining - instructs the parser that the data can be used
      for training AI language models.

   The rule names are case-insensitive and follow the same matching
   logic as the Allow and Disallow rules.  Whereas Allow and Disallow
   rules define whether the content can be downloaded, AllowAITraining
   and DisallowAITraining rules govern only the use of the content for
   AI training.
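
   As an illustration only, the following Python sketch shows one way a
   crawler might evaluate these rules.  It is not part of this
   specification.  The function names parse_ai_rules and
   ai_training_allowed are hypothetical.  The sketch assumes plain
   prefix matching without wildcards, assumes that the longest matching
   path wins with ties going to the allowing rule (as RFC 9309 describes
   for Allow and Disallow), and returns None when no rule matches,
   because this document does not define a default.

```python
from typing import Optional

# Rule names are compared case-insensitively; True means "training allowed".
AI_RULES = {"allowaitraining": True, "disallowaitraining": False}


def parse_ai_rules(robots_txt: str, user_agent: str) -> list[tuple[str, bool]]:
    """Collect (path, allowed) AI-training rules that apply to user_agent."""
    ua = user_agent.lower()
    specific: list[tuple[str, bool]] = []
    wildcard: list[tuple[str, bool]] = []
    agents: list[str] = []      # user-agent lines of the current group
    in_rules = False            # True once the current group lists rules
    matched_specific = False    # True if a group names this user agent
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:        # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            if value.lower() == ua:
                matched_specific = True
            continue
        in_rules = True
        if key in AI_RULES and value:
            rule = (value, AI_RULES[key])
            if ua in agents:
                specific.append(rule)
            elif "*" in agents:
                wildcard.append(rule)
    # Assumption: a group naming the crawler replaces the "*" group entirely.
    return specific if matched_specific else wildcard


def ai_training_allowed(rules: list[tuple[str, bool]], path: str) -> Optional[bool]:
    """Return True/False for the most specific matching rule, else None."""
    best: Optional[tuple[int, bool]] = None
    for prefix, allowed in rules:
        if path.startswith(prefix):
            candidate = (len(prefix), allowed)
            # Tuple comparison: longer prefix wins; on a tie, True (allow) wins.
            if best is None or candidate > best:
                best = candidate
    return None if best is None else best[1]


EXAMPLE = """
User-agent: *
DisallowAITraining: /
allowaitraining: /public/

User-agent: examplebot
DisallowAITraining: /private/
"""
```

   With the example file above, a crawler that is not named in any group
   falls under the "*" group, so "/public/a" is usable for training and
   "/x" is not.  For examplebot, only "/private/" is covered and any
   other path yields None.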

3.2.  Application Layer Response Header

   The same rules can also be set in an application-layer response
   header:

      DisallowAITraining - instructs the parser not to use the data for
      training AI language models.

      AllowAITraining - instructs the parser that the data can be used
      for training AI language models.

   The values are case-insensitive and follow the same matching logic
   as the Allow and Disallow rules.

3.3.  HTML Meta Element

   The same rules can also be set via an HTML meta element:

      <meta name="robots" content="DisallowAITraining">

      <meta name="examplebot" content="AllowAITraining">
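
   As an illustration only, the following Python sketch reads such meta
   elements using the standard library html.parser module.  It is not
   part of this specification.  The names AIMetaParser and
   meta_ai_training_allowed are hypothetical.  The sketch assumes that
   the content attribute may hold a comma-separated list of values and
   that an element naming a specific crawler takes precedence over the
   generic "robots" element; neither behavior is defined in this
   document.

```python
from html.parser import HTMLParser
from typing import Optional

# Values are compared case-insensitively; True means "training allowed".
AI_RULES = {"allowaitraining": True, "disallowaitraining": False}


class AIMetaParser(HTMLParser):
    """Collect AI-training values from <meta name=... content=...> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.rules: dict[str, bool] = {}   # lowercased meta name -> allowed?

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {key: (value or "") for key, value in attrs}
        name = attributes.get("name", "").strip().lower()
        # Assumption: content may list several comma-separated values.
        for token in attributes.get("content", "").split(","):
            token = token.strip().lower()
            if name and token in AI_RULES:
                self.rules[name] = AI_RULES[token]


def meta_ai_training_allowed(html: str, crawler_name: str) -> Optional[bool]:
    """Return True/False if a meta element applies to the crawler, else None."""
    parser = AIMetaParser()
    parser.feed(html)
    rules = parser.rules
    # Assumption: a crawler-specific element overrides the generic "robots" one.
    return rules.get(crawler_name.lower(), rules.get("robots"))


PAGE = (
    '<html><head>'
    '<meta name="robots" content="DisallowAITraining">'
    '<meta name="examplebot" content="AllowAITraining">'
    '</head><body></body></html>'
)
```

   For the page above, meta_ai_training_allowed(PAGE, "examplebot")
   follows the crawler-specific element, while any other crawler name
   falls back to the "robots" element.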

4.  IANA Considerations

   TODO:
   https://www.rfc-editor.org/rfc/rfc9110.html#name-field-name-registry
