Skip to main content

AI Agent Discovery and Invocation Protocol
draft-cui-ai-agent-discovery-invocation-01

Document Type Active Internet-Draft (individual)
Authors Yong Cui , Yihan Chao , Chenguang Du
Last updated 2026-02-12
RFC stream (None)
Intended RFC status (None)
Formats
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-cui-ai-agent-discovery-invocation-01
Network Working Group                                             Y. Cui
Internet-Draft                                       Tsinghua University
Intended status: Informational                                   Y. Chao
Expires: 16 August 2026                                            C. Du
                                                 Zhongguancun Laboratory
                                                        12 February 2026

               AI Agent Discovery and Invocation Protocol
               draft-cui-ai-agent-discovery-invocation-01

Abstract

   This document proposes a standardized protocol for discovery and
   invocation of AI agents.  It defines a common metadata format for
   describing AI agents (including capabilities, I/O specifications,
   supported languages, tags, authentication methods, etc.), a
   capability-based discovery mechanism, and a unified RESTful
   invocation interface.

   This revision additionally specifies an optional extension that
   enables intent-based agent selection prior to discovery and
   invocation, without changing existing discovery or invocation
   semantics.

   The goal is to enable cross-platform interoperability among AI agents
   by providing a discover-and-match mechanism and a unified invocation
   entry point.  Security considerations, including authentication and
   trust measures, are also discussed.  This specification aims to
   facilitate the formation of multi-agent systems by making it easy to
   find the right agent for a task and invoke it in a consistent manner
   across different vendors and platforms.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://example.com/LATEST.  Status information for this document may
   be found at https://datatracker.ietf.org/doc/draft-cui-ai-agent-
   discovery-invocation/.

   Discussion of this document takes place on the WG Working Group
   mailing list (mailto:WG@example.com), which is archived at
   https://example.com/WG.

   Source for this draft and an issue tracker can be found at
   https://github.com/USER/REPO.

Cui, et al.              Expires 16 August 2026                 [Page 1]
Internet-Draft                    AIDIP                    February 2026

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 16 August 2026.

Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Conventions and Definitions . . . . . . . . . . . . . . . . .   4
   3.  Agent Metadata Specification  . . . . . . . . . . . . . . . .   5
     3.1.  Core Fields . . . . . . . . . . . . . . . . . . . . . . .   5
     3.2.  Operations and I/O Schema . . . . . . . . . . . . . . . .   7
     3.3.  Example Agent Metadata  . . . . . . . . . . . . . . . . .   8
   4.  Agent Discovery Mechanism . . . . . . . . . . . . . . . . . .   8
     4.1.  Registry Overview . . . . . . . . . . . . . . . . . . . .   8
     4.2.  Agent Registration  . . . . . . . . . . . . . . . . . . .   9
     4.3.  Querying Agents . . . . . . . . . . . . . . . . . . . . .   9
       4.3.1.  Attribute-Based Query (Filter)  . . . . . . . . . . .  10
       4.3.2.  Semantic Query (Natural Language Search)  . . . . . .  10
       4.3.3.  Retrieve Single Agent . . . . . . . . . . . . . . . .  11
   5.  Agent Invocation  . . . . . . . . . . . . . . . . . . . . . .  11
     5.1.  Invocation Request  . . . . . . . . . . . . . . . . . . .  11

Cui, et al.              Expires 16 August 2026                 [Page 2]
Internet-Draft                    AIDIP                    February 2026

     5.2.  Invocation Response . . . . . . . . . . . . . . . . . . .  12
     5.3.  Additional Considerations for Invocation  . . . . . . . .  14
   6.  Agent Semantic Resolution . . . . . . . . . . . . . . . . . .  15
     6.1.  Non-Goals . . . . . . . . . . . . . . . . . . . . . . . .  15
   7.  Semantic Routing Platform . . . . . . . . . . . . . . . . . .  15
   8.  Backward Compatibility  . . . . . . . . . . . . . . . . . . .  16
   9.  Security Considerations . . . . . . . . . . . . . . . . . . .  16
   10. Example Interaction Flow  . . . . . . . . . . . . . . . . . .  16
   11. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  17
   12. Normative References  . . . . . . . . . . . . . . . . . . . .  17
   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  17
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  17

1.  Introduction

   As artificial intelligence technologies advance rapidly, AI
   agents—autonomous software components capable of perceiving their
   environment, reasoning, and taking actions to achieve goals—have
   emerged as a powerful paradigm for task execution.  Today, many
   organizations develop specialized AI agents for various purposes:
   from text translation and summarization, to code generation, to data
   analysis and beyond.  These agents are often offered as services,
   accessible over the network and may be integrated into larger
   systems.  However, despite the proliferation of AI agents, there is
   currently no standard protocol for discovering available agents and
   invoking their capabilities in a uniform way.

   Existing agent frameworks and platforms facilitate building agents
   but typically operate in isolated ecosystems, making cross-platform
   or cross-organization agent interoperability difficult.  Each
   platform tends to define its own APIs for agent description and
   invocation, which means a client wishing to use agents from multiple
   sources must adapt to disparate interfaces.  This lack of
   standardization creates friction, increases integration costs, and
   hampers the development of multi-agent collaborative systems.

   This document addresses these issues by proposing a standardized AI
   Agent Discovery and Invocation Protocol.  The protocol provides:

   1.  *Agent Metadata Specification:* A structured JSON Schema for
       describing an agent's identity, capabilities, inputs, outputs,
       authentication requirements, and other attributes.  This enables
       agents to publish their specifications in a machine-readable
       form.

Cui, et al.              Expires 16 August 2026                 [Page 3]
Internet-Draft                    AIDIP                    February 2026

   2.  *Discovery Mechanism:* A registry-based approach where agents
       register themselves and clients can search for agents by
       capability, tags, or semantic queries.  The registry is language
       and platform agnostic, facilitating cross-platform discovery.

   3.  *Invocation Interface:* A RESTful API that enables a client
       (which could be a human user application, another agent, or an
       orchestration system) to invoke an agent's capabilities through a
       standard endpoint and JSON payloads.

   4.  *Security Considerations:* Guidelines for authentication,
       authorization, encrypted transport (TLS), and trust
       establishment, ensuring that discovery and invocation happen
       securely.

   5.  *Interoperability with Existing Standards:* This specification
       references existing standards such as JSON Schema, OAuth 2.0
       [RFC6749], and OpenAPI concepts, and leverages established web
       technologies for broad compatibility.

   The primary audience for this specification includes developers of AI
   agent platforms, providers of AI agent services, and system
   architects building AI-enabled applications or multi-agent systems.
   By adopting this protocol, an AI agent developer can make their agent
   accessible to a wide ecosystem, and a client application can
   integrate AI agents from multiple vendors without custom integration
   for each.

   This revision extends the base protocol with an optional Agent
   Semantic Resolution layer that enables intent-based agent selection.
   This extension allows a Host Agent or coordinator to describe a task
   intent and receive candidate agents without predetermining which
   agent to invoke.  ASR does not replace discovery; it adds a semantic
   matching phase that can precede or augment the capability-based
   search defined in earlier sections.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   *  *AI Agent:* An autonomous software component that can perform
      tasks using artificial intelligence capabilities.  Agents may wrap
      language models, specialized ML models, or reasoning engines,
      exposing their abilities via defined interfaces.

Cui, et al.              Expires 16 August 2026                 [Page 4]
Internet-Draft                    AIDIP                    February 2026

   *  *Agent Metadata:* A structured description of an agent, including
      name, description, capabilities, input/output schemas,
      authentication requirements, and endpoint information.

   *  *Agent Registry (Discovery Service):* A service that maintains a
      directory of registered agents and supports queries for
      discovering agents by attributes or semantic search.

   *  *Gateway:* (Optional) An intermediary service that routes client
      requests to appropriate agents.  In some deployments, the registry
      or another service acts as a gateway to simplify client-to-agent
      connections.

   *  *Invocation Endpoint:* The URL provided by an agent (or gateway)
      where clients send requests to invoke the agent's capabilities.

   *  *Capability:* A high-level function an agent can perform,
      identified by a string (e.g., "translation", "summarization",
      "image_classification").

   *  *Operation:* A specific action supported by an agent.  Some agents
      may have multiple operations; for example, an agent offering both
      translation and language detection could list these as separate
      operations, each with its own input/output schema.

3.  Agent Metadata Specification

   The Agent Metadata Specification defines a standard JSON document
   that describes an agent.  All agents that wish to be discoverable and
   invocable through this protocol MUST provide a metadata document
   conforming to the schema below.  This metadata is used for agent
   registration and returned to clients during discovery.

3.1.  Core Fields

   The following are the core fields of an agent metadata document:

   *  *id (string):* A globally unique identifier for the agent.  This
      could be a UUID or a similarly unique value, assigned by the
      registry upon registration or by the agent provider in advance.
      This ID is used to refer to the agent in all subsequent operations
      (e.g., retrieval, invocation routing).

   *  *name (string):* A human-readable name for the agent (e.g.,
      "Chinese-English Translator Agent").  Names need not be unique but
      should be descriptive.

Cui, et al.              Expires 16 August 2026                 [Page 5]
Internet-Draft                    AIDIP                    February 2026

   *  *description (string):* A detailed description of the agent, its
      purpose, and capabilities in natural language.  This helps both
      human users and semantic search algorithms understand what the
      agent does.

   *  *version (string):* The version of the agent or its metadata
      (e.g., "1.0.0").  This allows tracking of agent updates over time.

   *  *publisher (string):* The name or identifier of the entity
      publishing the agent (e.g., an organization name or developer
      name).

   *  *capabilities (array of strings):* A list of capabilities the
      agent supports.  Capabilities are high-level descriptors (like
      tags or categories) that clients can filter by.  Examples:
      ["translation", "summarization", "text_generation"].

   *  *tags (array of strings):* Additional tags for search and
      categorization (e.g., ["nlp", "chinese", "transformer_model",
      "cloud"]).  Tags differ from capabilities in that they can include
      broader or orthogonal categories (like domain, language support,
      deployment model, etc.).

   *  *endpoint (string):* The URL of the agent's invocation endpoint.
      If the agent is behind a gateway, this could be either the direct
      endpoint or, if direct access is not allowed, a gateway path
      (e.g., the gateway might provide a unified endpoint like
      /agents/{id}/invoke and internally route to the actual agent
      endpoint).

   *  *supported_languages (array of strings, optional):* A list of
      languages the agent supports (e.g., ["en", "zh", "fr"]).  For
      agents dealing with natural language tasks, this field indicates
      which languages are handled.  If omitted, the agent is either
      language-agnostic or should not be filtered by language.

   *  *authentication (object, optional):* Describes the authentication
      mechanism required to invoke the agent.  This object may include:

      -  *type (string):* e.g., "api_key", "oauth2_bearer", "mtls"
         (mutual TLS), "none".

      -  *instructions (string):* Human-readable note or URL for
         obtaining credentials.

      -  *scopes (array of strings):* If OAuth 2.0 is used, the OAuth
         scopes required.

Cui, et al.              Expires 16 August 2026                 [Page 6]
Internet-Draft                    AIDIP                    February 2026

      If no authentication is required, this field can be omitted or set
      with type: "none".

   *  *status (string, optional):* Operational status of the agent
      (e.g., "active", "inactive", "deprecated").  The registry may use
      this to filter out agents that are not currently available.

   *  *additional fields:* Additional fields may include metadata about
      rate limits (e.g., max calls per minute), pricing info (if the
      agent charges per use), or links to documentation.  These are not
      standardized here but can be included in agent metadata as needed.

3.2.  Operations and I/O Schema

   Each agent MUST describe its input and output formats.  This is done
   using the *operations* field:

   *  *operations (array of objects):* A list of operations the agent
      supports.  Each operation object has:

      -  *name (string):* The operation name/identifier (e.g.,
         "translateText", "summarize").

      -  *description (string):* A description of what the operation
         does.

      -  *inputs (object):* A JSON Schema describing the expected input.
         This allows clients to understand what data to send.  The JSON
         Schema can specify required fields, types, enums, etc.

      -  *outputs (object):* A JSON Schema describing the output format.

      -  *examples (array of objects, optional):* Example input/output
         pairs.  Each example is an object {"input": {...}, "output":
         {...}} showing a sample invocation.

   If an agent has a single operation, this array will have one element.
   If it can do multiple distinct tasks, each is listed here.  Some
   agents may not have structured operations (e.g., a general-purpose
   language model that just produces text).  In that case, the
   operations field might include a generic operation like {"name":
   "generate", ...}.

Cui, et al.              Expires 16 August 2026                 [Page 7]
Internet-Draft                    AIDIP                    February 2026

   If an agent has a single operation, this array will have one element.
   If it can do multiple distinct tasks, each is listed here.  For
   simple agents with one primary function, an alternative is to use
   top-level inputs and outputs fields directly (instead of an
   operations array).  In that case, the whole agent effectively has one
   implied operation.  This spec allows both styles, but using
   operations is recommended for future extensibility.

3.3.  Example Agent Metadata

   Below is an example metadata JSON for a translation agent:

   json { "id": "agent-12345", "name": "Chinese-English Translator",
   "description": "Translates text between Chinese and English with high
   accuracy using a fine-tuned model.", "version": "1.2.0", "publisher":
   "ExampleAI Inc.", "capabilities": ["translation"], "tags": ["nlp",
   "chinese", "english", "cloud"], "endpoint":
   "https://api.example.com/agents/translate", "supported_languages":
   ["en", "zh"], "authentication": { "type": "api_key", "instructions":
   "Include 'X-API-Key' header with your API key." }, "status":
   "active", "operations": [ { "name": "translateText", "description":
   "Translates text from source language to target language.", "inputs":
   { "type": "object", "properties": { "text": {"type": "string"},
   "source_language": {"type": "string", "enum": ["en", "zh"]},
   "target_language": {"type": "string", "enum": ["en", "zh"]} },
   "required": ["text", "source_language", "target_language"] },
   "outputs": { "type": "object", "properties": { "translated_text":
   {"type": "string"} } }, "examples": [ { "input": {"text": "你好世界",
   "source_language": "zh", "target_language": "en"}, "output":
   {"translated_text": "Hello World"} } ] } ] }

   This metadata tells us the agent is an active translation agent for
   Chinese and English, requires an API key for authentication, and has
   one operation translateText with a clear input/output schema.

4.  Agent Discovery Mechanism

   The discovery mechanism allows clients to find agents that meet
   certain criteria.  Discovery is provided by an Agent Registry (or
   Discovery Service) that aggregates metadata from multiple agents.

4.1.  Registry Overview

   The Agent Registry is a network-accessible service that:

   1.  Allows agents (or their administrators) to register metadata
       about the agent.

Cui, et al.              Expires 16 August 2026                 [Page 8]
Internet-Draft                    AIDIP                    February 2026

   2.  Stores and indexes these metadata entries for efficient search.

   3.  Provides endpoints for clients to query and retrieve agent
       information.

   A registry may be operated by an organization for its internal
   agents, or by a third party acting as a directory of agents across
   multiple providers.  Multiple registries can coexist;
   interoperability between registries is facilitated by consistent
   metadata formats, though formal registry federation is out of scope
   for this draft.

4.2.  Agent Registration

   An agent (or its administrator) registers with the registry by
   sending its metadata to a registration endpoint:

   *  *Endpoint:* POST /agents

   *  *Request Body:* The agent metadata JSON document.

   *  *Response:* On success, the registry returns *201 Created* (if a
      new agent was added) or *200 OK* (if an existing agent was
      updated), with the stored agent metadata (including the assigned
      id if the agent did not provide one).  If validation fails (e.g.,
      missing required fields), the registry returns a *400 Bad Request*
      with error details.

   The registry MUST validate the metadata against the schema.
   Registration may require authentication (for example, the registry
   only allows verified publishers to register agents).

   Updates to an agent's metadata (e.g., a new version, changed
   endpoint, etc.) can be done via a PUT request to the agent's entry:

   *  *Endpoint:* PUT /agents/{id}

   *  *Request Body:* Updated metadata.

   *  *Response:* *200 OK* on success; *404 Not Found* if no agent with
      that ID exists; *403 Forbidden* if the requester is not authorized
      to update that agent.

4.3.  Querying Agents

   Clients query the registry using the search endpoint.  The protocol
   supports two types of queries:

Cui, et al.              Expires 16 August 2026                 [Page 9]
Internet-Draft                    AIDIP                    February 2026

4.3.1.  Attribute-Based Query (Filter)

   Clients can specify criteria to filter agents by capabilities, tags,
   supported languages, etc.

   *  *Endpoint:* GET /agents?capabilities=X&tags=Y&language=Z

   *  or a structured query via *POST /agents/search* with a JSON body.

   For simplicity, *POST /agents/search* is recommended for more complex
   queries.  The body might look like:

   json { "filters": { "capabilities": ["translation"],
   "supported_languages": ["en", "zh"], "tags": ["nlp"] }, "top": 10 }

   This returns up to 10 agents that match all the specified filters.
   Filters are combined with AND logic (the agent must satisfy all
   conditions).  Capabilities and tags are matched by set intersection
   (the agent must have at least the ones listed).

   The response is a JSON array of agent summary objects:

   json [ { "id": "agent-12345", "name": "Chinese-English Translator",
   "description": "...", "endpoint": "https://api.example.com/agents/
   translate", "capabilities": ["translation"] }, ... ]

   Summary objects include essential fields to help the client decide
   which agent to use, without returning the full detailed metadata.  A
   client can retrieve full metadata via the single-agent endpoint.

4.3.2.  Semantic Query (Natural Language Search)

   In addition to attribute-based search, the registry MAY support
   semantic search where the client describes a need in natural
   language, and the registry uses AI techniques (embeddings, LLM-based
   matching) to find relevant agents.  This is an optional feature for
   registries.

   *  *Endpoint:* POST /agents/search

   *  *Request Body:* json { "query": "I need an agent that can
      summarize long legal documents in Chinese.", "top": 5 }

   The registry returns a ranked list of agents whose descriptions match
   the query semantically, possibly including a match score.  For
   example:

Cui, et al.              Expires 16 August 2026                [Page 10]
Internet-Draft                    AIDIP                    February 2026

   json [ { "id": "agent-67890", "name": "Legal Document Summarizer",
   "description": "...", "score": 0.93 }, ... ]

   Semantic search enables flexible discovery beyond exact filter
   matching, aligning with how users or orchestrating agents might
   reason about tasks.  Registries not supporting semantic search simply
   ignore the query text and rely on provided filters.

4.3.3.  Retrieve Single Agent

   *  *Endpoint:* GET /agents/{id}

   *  *Response:* Full metadata JSON for the specified agent, or *404*
      if not found.

5.  Agent Invocation

   Once a client discovers a suitable agent, it invokes the agent by
   sending a request to the agent's endpoint.  This section defines the
   interface for invocation.

5.1.  Invocation Request

   To invoke an agent, the client sends an HTTP POST request to the
   agent's invocation endpoint with a JSON body containing the input
   data for the agent's task.

   *  *Method:* POST

   *  *URL:* The endpoint URL from the agent's metadata (e.g.,
      https://api.example.com/agents/translate).  If a gateway is used,
      the URL might be a gateway-provided path.

   *  *Headers:*

      -  Content-Type: application/json

      -  Authentication header as required (e.g., Authorization: Bearer
         <token> or X-API-Key: <key>).

   *  *Body:* A JSON object containing input data as per the agent's
      input schema.

   For example, invoking the translation agent:

   json { "text": "Hello, how are you?", "source_language": "en",
   "target_language": "fr" }

Cui, et al.              Expires 16 August 2026                [Page 11]
Internet-Draft                    AIDIP                    February 2026

   This corresponds to the agent's expected input fields.  If the agent
   had multiple operations and a unified endpoint, there might be an
   additional field to specify which operation or capability to use.
   For instance, the JSON could include something like "operation":
   "translateText" if needed.  Alternatively, different operations could
   be exposed at different URLs (e.g., /agents/xyz/translate vs
   /agents/xyz/summarize), in which case the operation is selected by
   the URL and no extra field is required.

   The protocol does not fix a specific parameter naming; it defers to
   the agent's published schema.  The only requirement is that the
   client's JSON must conform to what the agent expects.  For
   interoperability, using clear field names and standard data types
   (strings, numbers, booleans, or structured objects) is encouraged.
   Binary data (like images for an image-processing agent) should be
   handled carefully: typically, binary inputs can be provided either as
   URLs (pointing to where the data is stored), or as base64-encoded
   strings within the JSON, or by using a multipart request.  This
   specification suggests that if agents need to receive large binary
   payloads, they either use URL references or out-of-scope mechanisms
   (like a separate upload and then an ID in the JSON).  The core
   invocation remains JSON-based for simplicity and consistency.

   *Headers:* If authentication is required (see Security section), the
   client must also include the appropriate headers (e.g.,
   Authorization: Bearer <token> or an API key header) as dictated by
   the agent's metadata.  The invocation request may also include
   optional headers for correlation or debugging, such as a request ID,
   but those are not standardized here.

5.2.  Invocation Response

   The agent (or gateway) will process the request and return a
   response.  The status code and JSON body of the response follow these
   guidelines:

   *  *Success (2xx status):* If the agent successfully performed its
      task and produced a result, the status SHOULD be *200 OK* (or *201
      Created* if a new resource was created as a result, though usually
      for these actions 200 is fine).  The response body will contain
      the output data in JSON.  Ideally, the output JSON conforms to the
      agent's advertised output schema.

      For example, for the translation request above, a success response
      might be:

      json { "translated_text": "Bonjour, comment êtes-vous?" }

Cui, et al.              Expires 16 August 2026                [Page 12]
Internet-Draft                    AIDIP                    February 2026

      Here the JSON structure matches what was described in the agent
      metadata's outputs.  If the output is complex (e.g., multiple
      fields or nested objects), those should appear accordingly.  The
      response can include other informational fields if necessary (for
      example, some agents might return usage metrics, like tokens used
      or time taken, or a trace id for debugging, but these are optional
      and out of scope of the core spec).

   *  *Client Error (4xx status):* If the request was malformed or
      invalid, the agent returns a *4xx* status code.  The most common
      would be *400 Bad Request* for a JSON that doesn't conform to the
      expected schema or missing required fields.  For example, if the
      client omitted a required field target_language, the agent might
      respond with 400.  The response body SHOULD include an error
      object explaining what went wrong.  We define a simple standard
      for error objects:

      json { "error": { "code": "InvalidInput", "message": "Required
      field 'target_language' is missing." } }

      Here, "code" is a short string identifier for the error type
      (e.g., InvalidInput, Unauthorized, NotFound), and "message" is a
      human-readable description.  The agent can include additional
      details if available (e.g., a field name that is wrong, etc.).  If
      the error is due to unauthorized access, *401 Unauthorized* or
      *403 Forbidden* should be used (with an appropriate error message
      indicating credentials are missing or insufficient).  If the agent
      ID is not found (perhaps the client used an outdated reference),
      *404 Not Found* is appropriate.

   *  *Server/Agent Error (5xx status):* If something goes wrong on the
      agent's side during processing (an exception, a timeout while
      executing the task, etc.), the agent (or gateway) returns a *5xx*
      status (most likely *500 Internal Server Error* or *502/504* if
      there are upstream issues).  The response should again include an
      error object.  For example:

      json { "error": { "code": "AgentError", "message": "The agent
      encountered an unexpected error while processing the request." } }

      The agent might log the detailed error internally, but only convey
      a generic message to the client for security.  A *503 Service
      Unavailable* might be returned if the agent is temporarily
      overloaded or offline, indicating the client could retry later.

   *  *Status Codes Summary:* In short, this protocol expects the use of
      standard status codes to reflect outcome (200 for success, 4xx for
      client-side issues, 5xx for server-side issues).  Agents should

Cui, et al.              Expires 16 August 2026                [Page 13]
Internet-Draft                    AIDIP                    February 2026

      avoid using 2xx if the operation did not semantically succeed
      (even if technically a response was generated).  For example, if
      an agent is a composite that calls other services and one of those
      calls fails, it should propagate an error rather than returning
      200 with an error in the data.

5.3.  Additional Considerations for Invocation

   *  *Streaming Responses:* Some agents (especially those wrapping
      large language models) may produce results that are streamed (for
      example, token-by-token outputs).  While this base protocol
      assumes a request-response pattern with the full result delivered
      at once, it can be extended to support streaming by using chunked
      responses or WebSockets.  For instance, an agent might accept a
      parameter like stream: true and then send partial outputs as they
      become available.  This is an advanced use case and not elaborated
      in this draft, but implementers should consider compatibility with
      streaming if real-time responsiveness is needed.

   *  *Batch Requests:* If a client wants to send multiple independent
      requests to an agent in one go (for efficiency), the protocol can
      support that by allowing an array of input objects in the POST
      body instead of a single object.  The response would then be an
      array of output results.  This is optional and depends on agent
      support.

   *  *Idempotency and Retries:* Most agent invocations are not strictly
      idempotent (since an agent might perform an action or have side
      effects), but many are pure functions (e.g., translate text).
      Clients and gateways should design with retry logic carefully — if
      a network failure happens, a retry might re-run an operation.
      It's best to ensure that agents' operations are either idempotent
      or have safeguards (for example, an operation that sends an email
      might have an idempotency key).

   *  *Operation Metadata:* In cases where the agent defines multiple
      operations in its metadata, the invocation interface might allow a
      generic endpoint that accepts an operation name.  Alternatively,
      each operation could be a sub-resource.  This draft leaves the
      exact mechanism flexible: an implementation could choose one of
      these approaches.  The key is that the invocation uses POST and a
      JSON body following the agent's schema.

Cui, et al.              Expires 16 August 2026                [Page 14]
Internet-Draft                    AIDIP                    February 2026

6.  Agent Semantic Resolution

   Agent Semantic Resolution (ASR) is an optional extension to the
   discovery mechanism defined in this document.  ASR enables a client
   to resolve a task intent into one or more candidate agents prior to
   invoking any specific agent.

   ASR operates on the following conceptual model:

   (Intent, Context, Policy) → (Agent Endpoint(s), Invocation Metadata)

   The intent represents the task to be performed, while context and
   policy may include domain constraints, trust requirements, or
   performance considerations.

6.1.  Non-Goals

   ASR explicitly does not provide:

   *  Name-to-address resolution

   *  Global or persistent agent identifiers

   *  Replacement for DNS, ANS, or URI-based registries

   ASR answers the question "Which agent(s) should handle this task
   now?" rather than "Where is agent X located?".

7.  Semantic Routing Platform

   A Semantic Routing Platform (SRP) is a control-plane service that
   implements ASR.  An SRP assists a Host Agent in selecting appropriate
   agents before standard discovery and invocation procedures are used.

   An SRP MAY perform semantic matching, ranking, and policy-based
   filtering of candidate agents.  The SRP does not participate in task
   execution and does not alter the invocation semantics defined in this
   document.

   Interaction with an SRP is OPTIONAL.  Clients that do not support ASR
   continue to operate using the discovery and invocation mechanisms
   defined in earlier sections.

Cui, et al.              Expires 16 August 2026                [Page 15]
Internet-Draft                    AIDIP                    February 2026

8.  Backward Compatibility

   All discovery and invocation mechanisms defined in previous revisions
   of this document remain valid and unchanged.

   Agent Semantic Resolution is an optional extension.  Implementations
   MAY support ASR incrementally, and registries MAY provide semantic
   resolution capabilities without affecting existing clients.

9.  Security Considerations

   Security is a critical aspect of this protocol.  All discovery and
   invocation traffic MUST be protected with TLS [RFC8446], and
   authentication mechanisms such as OAuth 2.0 [RFC6749] bearer tokens,
   API keys, or mutual TLS are required except for public discovery
   endpoints.  Registries MUST enforce per-client entitlements, ensuring
   that both search results and invocation access respect permissions
   and scopes.  Gateways forwarding requests should authenticate
   themselves to agents, and agents should maintain stable identifiers
   and use signed responses when integrity is essential.  All
   communication MUST be encrypted, and agents are encouraged to
   disclose data-retention or logging practices, while sensitive data is
   best handled by on-premises or certified agents.  To mitigate abuse,
   registries and agents MUST implement rate limiting and quotas,
   particularly in semantic search scenarios.  Trust mechanisms such as
   certification, test harnesses, or reputation systems may be used to
   validate agent claims, and metadata fields like "certification" or
   "quality_score" can inform client trust decisions.  Systems SHOULD
   also provide audit and logging with privacy-aware retention, while
   clients must treat agent outputs as untrusted until verified, using
   sandboxing and validation before executing code or commands.

   When Agent Semantic Resolution is used, security considerations
   extend to the pre-invocation phase.  Resolution services SHOULD
   validate agent capability claims, apply policy constraints, and
   exclude agents that do not meet trust or reputation requirements.
   Agents deemed unsafe SHOULD NOT be returned as resolution candidates.

10.  Example Interaction Flow

   1.  *Search:* Client POST /agents/search with {"query":"summarize an
       English document","filters":{"capabilities":["summarization"],"su
       pported_language":"en"},"top":3}.

   2.  *Select:* Registry returns candidate agents with id, name,
       description, and score.  Client retrieves full metadata via GET
       /agents/{id} if needed.

Cui, et al.              Expires 16 August 2026                [Page 16]
Internet-Draft                    AIDIP                    February 2026

   3.  *Invoke:* Client POST to the agent's endpoint (or gateway path)
       with inputs conforming to agent schema and required auth header.

   4.  *Handle Response:* Client processes success or error response;
       may log usage and optionally rate/feedback the agent.

11.  IANA Considerations

   This document has no IANA actions.

12.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC6749]  Hardt, D., Ed., "The OAuth 2.0 Authorization Framework",
              RFC 6749, DOI 10.17487/RFC6749, October 2012,
              <https://www.rfc-editor.org/rfc/rfc6749>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

   [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017,
              <https://www.rfc-editor.org/rfc/rfc8259>.

   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
              <https://www.rfc-editor.org/rfc/rfc8446>.

   [RFC9110]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
              Ed., "HTTP Semantics", STD 97, RFC 9110,
              DOI 10.17487/RFC9110, June 2022,
              <https://www.rfc-editor.org/rfc/rfc9110>.

Acknowledgments

   TODO acknowledge.

Authors' Addresses

Cui, et al.              Expires 16 August 2026                [Page 17]
Internet-Draft                    AIDIP                    February 2026

   Yong Cui
   Tsinghua University
   Beijing, 100084
   China
   Email: cuiyong@tsinghua.edu.cn
   URI:   http://www.cuiyong.net/

   Yihan Chao
   Zhongguancun Laboratory
   Beijing, 100094
   China
   Email: chaoyh@zgclab.edu.cn

   Chenguang Du
   Zhongguancun Laboratory
   Beijing, 100094
   China
   Email: ducg@zgclab.edu.cn

Cui, et al.              Expires 16 August 2026                [Page 18]