Network Working Group                                       A. Akhavain
Internet-Draft                                                 H. Moussa
Intended status: Informational                             Huawei Canada
Expires: 23 January 2026                                    22 July 2025
AI Network for Training, Inference, and Agentic Interactions
draft-akhavain-moussa-ai-network-00
Abstract
Artificial Intelligence (AI) is rapidly reshaping industries and
daily life, driven by advances in large language models (LLMs) such
as ChatGPT, Claude, Grok, and DeepSeek. These models have
demonstrated the transformative potential of AI across diverse
applications, from productivity tools to complex decision-making
systems. However, the effectiveness and reliability of AI hinge on
two foundational processes: training and inference. Each presents
unique challenges related to data management, computation,
connectivity, privacy, trust, security, and governance. This
document introduces the Data Aware-Inference and Training Network
(DA-ITN)—a unified, intelligent, multi-plane network architecture
designed to address the full spectrum of AI system requirements. DA-
ITN provides a scalable and adaptive infrastructure that connects AI
clients, data providers, service facilitators, and computational
resources to support end-to-end AI lifecycle operations. The
architecture features dedicated control, data, and operations &
management (OAM) planes to orchestrate training and inference while
ensuring reliability, transparency, and accountability. By outlining
the key requirements of AI systems and demonstrating how DA-ITN
fulfills them, this document presents a vision for the future of AI-
native networking—an "AI internet"—optimized for continuous learning,
scalable deployment, and seamless agent-to-agent collaboration.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 23 January 2026.
Copyright Notice
Copyright (c) 2025 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1.  Introduction
2.  Training Requirements
    2.1.  Centralized versus Decentralized Training
    2.2.  Requirements Breakdown
        2.2.1.  Data Collection/Model Dispatching
        2.2.2.  Data and Resource Discovery
        2.2.3.  Mobility and Service Continuity Handling
        2.2.4.  Privacy, Trust, and Data Ownership and Utility
        2.2.5.  Testing and Performance Management
        2.2.6.  QoS Guarantee
        2.2.7.  Charging and Billing
3.  Inference
    3.1.  Requirement Breakdown
        3.1.1.  Model Deployment and Mobility
        3.1.2.  Model Discovery and Description
        3.1.3.  Query and Inference Result Routing
        3.1.4.  Inference Chaining/Collaborative Inference
        3.1.5.  Compute and Resource Management
        3.1.6.  Privacy Preservation and Security
        3.1.7.  Utility Handling and QoS Requirements
        3.1.8.  Model Upgrade Streamlining
        3.1.9.  Charging and Billing
4.  Data Aware Inference and Training Network (DA-ITN): General
    Framework
    4.1.  Control Plane and Intelligence Layer
    4.2.  Data Plane
    4.3.  Operation and Management Plane (OAM)
    4.4.  Summary of the DA-ITN General Framework
5.  DA-ITN for Training
6.  DA-ITN for Inference
7.  DA-ITN-Facilitation Agentic Networks
8.  Security Considerations
9.  IANA Considerations
10. Conclusions
Contributors
Authors' Addresses
1. Introduction
AI has become a major focus in recent years, with its influence
rapidly expanding from everyday tasks like scheduling to complex
areas such as healthcare. This growth is largely driven by advances
in large language models (LLMs) like ChatGPT, Claude, Grok, and
DeepSeek, which are now widely used for tasks such as brainstorming,
editing, coding, and data analysis. These real-world applications
highlight AI’s transformative power to boost productivity and
simplify life. It’s clear that AI is not a passing trend but a
lasting and evolving force.
However, it is crucial to recognize that the success of AI systems
relies on two fundamental pillars: training and inference. Both of
these pillars have a number of factors and moving parts that need to
be carefully coordinated, designed, and managed to ensure accuracy,
resilience, usability, continuous evolution, trustworthiness, and
reliability. Moreover, once deployed, AI systems must be
continuously monitored and governed to safeguard user safety and
societal well-being.
As such, aspects such as data management, computational resources,
connectivity, security, privacy, trust, billing, and rigorous testing
are all crucial when handling AI systems. Thus, it is important to
clearly understand the requirements of AI systems from both the
training and inference perspectives, as these two pillars
constitute an entangled framework and cannot be tackled in isolation.
In this document, we present a vision of an ecosystem, especially
designed to satisfy the requirements of AI from training and
inference points of view. We propose a unified, intelligent network
architecture—the Data Aware-Inference and Training Network (DA-ITN).
This ecosystem is envisioned as a comprehensive, multi-plane network
with dedicated control, data, and operations & management (OAM)
planes. It is designed to interconnect all relevant stakeholders,
including clients, AI service providers, data providers, and third-
party facilitators. Its core objective is to provide the
infrastructure and coordination necessary to support an ecosystem for
enabling AI of the future at scale.
This document aims to introduce the DA-ITN vision and establish a
compelling case for its central role in enabling a new generation of
AI-native networks, i.e., AI internet. These networks will be
optimized not only for learning and inference but also for seamless
collaboration, interaction, and communication among AI agents. To
that end, we begin by outlining the specific requirements of AI from
both the training and inference standpoints. We then introduce the
core components of the DA-ITN and illustrate how they collectively
meet these requirements. Finally, this network is positioned as an
ecosystem for agent-to-agent collaborations, interactions, and
communications.
2. Training Requirements
AI model training is the foundational process through which an
artificial intelligence system learns to perform tasks by analyzing
data and adjusting its internal parameters—typically the weights in
neural networks—to minimize prediction errors. At its core, this
process involves feeding input data into a model, and applying
optimization algorithms to iteratively refine the model’s
performance. Among the most influential outcomes of this process are
foundation models, such as ChatGPT and its peers, which are capable
of performing a wide range of tasks across domains. Training these
models now occurs at an unprecedented scale, requiring massive
compute infrastructure, enormous amounts of training data, high-speed
interconnects, and parallelized training frameworks (e.g., data,
model, and pipeline parallelism).
2.1. Centralized versus Decentralized Training
It is clear from the above that no matter how advanced the model
architecture may be, the success of any training process ultimately
hinges on two fundamental components: the model and the data. While
the model itself is often developed and hosted in a centralized
location—typically within the secure infrastructure of the model
owner or designer—data is inherently distributed. It originates from
sensors, devices, logs, events, documents, and other diverse sources
spread across different geographies and domains. Indeed,
whether due to geographic dispersion, organizational silos, privacy
constraints, or edge-device generation, data rarely exists in a
single, clean repository.
Today, model training can happen in one of two ways or a combination
thereof: centralized or decentralized. In centralized training,
thanks to the development of robust data collection techniques and
high-throughput connectivity networks, it is now feasible to collect
data and bring it to where the model training would occur. This
traditional approach is often referred to as model-centralized
training. On the other hand, a more recent paradigm known as model-
follow-data has emerged, advocating for the reverse: rather than
transporting large volumes of potentially sensitive data to a central
location, the model is dispatched to where the data resides—enabling
distributed or federated training.
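As a minimal, hypothetical sketch of the model-follow-data paradigm
(the names and the toy linear model are illustrative only; a real
deployment would dispatch the model over the network and train it
with an ML framework), the following Python code sends a copy of the
parameters to each data site, trains locally, and averages the
returned updates in the style of federated averaging:

   def local_update(weights, dataset, lr=0.1):
       # One pass of gradient steps on squared error; the "model" is
       # a plain linear parameter list, purely for illustration.
       new = list(weights)
       for x, y in dataset:
           pred = sum(w * xi for w, xi in zip(new, x))
           err = pred - y
           new = [w - lr * err * xi for w, xi in zip(new, x)]
       return new

   def federated_round(weights, data_sites):
       # Dispatch the model to every site, then average the updates.
       updates = [local_update(weights, site) for site in data_sites]
       return [sum(ws) / len(updates) for ws in zip(*updates)]

   # Two data sites whose (input, label) samples never leave home.
   sites = [
       [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0)],
       [([1.0, 1.0], 5.0)],
   ]
   weights = [0.0, 0.0]
   for _ in range(100):
       weights = federated_round(weights, sites)
   print(weights)   # converges toward [2.0, 3.0]

Here the raw samples never leave their sites; only parameter updates
travel, which is precisely the trade-off discussed above.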
Accordingly, to facilitate the training process, rendezvous points
between distributed data, compute and storage resources, and an AI
model awaiting training must be scheduled and managed; this
arrangement is fundamental to successful model training. However,
the scheduling process introduces a number of challenges spanning
privacy, trust, utility, and the management of computational and
connectivity resources. Moreover, as AI adoption accelerates, both centralized
and decentralized approaches will drive increasing pressure on
underlying connectivity infrastructure. Therefore, to ensure
scalable, efficient, and cost-effective AI training, it is vital to
implement intelligent mechanisms for managing data and model
movement, selecting relevant subsets for training, and minimizing
unnecessary transfers.
In the sections that follow, we explore the architectural and
operational requirements needed to support this vision and lay the
foundation for a high-performance, AI-native training ecosystem.
2.2. Requirements Breakdown
Consider a number of AI model training clients awaiting training
service. An AI model training client is a user with a raw or a pre-
trained model who wishes to train or continue training their AI model
using data that can be found in the data corpus. The data corpus
(the global dataset), as has been previously established, consists of
a group of datasets that are distributed across various geographical
locations. AI clients require access to this data either in a
centralized or distributed manner.
2.2.1. Data Collection/Model Dispatching
As previously discussed, data is inherently distributed. In
centralized training paradigms, this data must be transferred from
its sources to centralized locations where model training occurs.
Consider a scenario involving multiple clients, each awaiting
centralized training of AI models using distinct data sets of
interest. Aggregating large volumes of data from geographically
dispersed sources to centralized servers introduces several
significant challenges:
* Communication Overhead: The sheer volume of data to be transmitted
can place substantial strain on the underlying transport networks,
resulting in increased latency and bandwidth consumption.
* Redundant Knowledge Transfer: Despite originating from different
sources, data sets may carry overlapping or identical knowledge
content. Transmitting such redundant content leads to unnecessary
duplication, wasting resources without providing additional
training value.
* Timely Delivery: In certain applications, the freshness of data is
critical. Delays in transmission can degrade the value of the
information, as these applications are sensitive to the Age of
Information (AoI)—the time elapsed since data was last updated at
the destination.
* Multi-Modal Data Handling: Data often exists in various
formats—such as text, images, audio, and video—each with distinct
transmission requirements. Ensuring accurate and reliable
delivery of these diverse data types necessitates differentiated
Quality of Service (QoS) levels tailored to the characteristics
and sensitivity of each modality.
* Heterogeneous Access Media: Data may reside across diverse
communication infrastructures—for example, some data may be
accessible only via 3GPP mobile networks, while other data may be
confined to wireline networks. Coordinating data collection
across these heterogeneous domains, while maintaining
synchronization and consistency, presents a significant
operational challenge.
Importantly, many of these challenges are alleviated in decentralized
training frameworks, where data remains local to its source and is
not transferred over the network. Instead, the model itself is
distributed to the various data locations. However, this alternate
paradigm introduces its own set of unique challenges.
As previously noted, modern AI models are growing increasingly large
in size. In decentralized training, it is often necessary to
replicate the model and transmit it to multiple, geographically
dispersed data sites. This results in a different but equally
significant set of logistical and technical hurdles:
* Communication Overhead: While data transfer is avoided,
dispatching large model files across the network to multiple
destinations can still impose substantial load on communication
infrastructure, particularly in bandwidth-constrained
environments.
* Redundant Knowledge Transfer: Data residing at different locations
may share overlapping knowledge content. Sending models to
multiple sites with redundant knowledge content leads to
inefficient use of network resources. In some cases, even when
knowledge content is only partially redundant, it may be more
efficient—considering communication cost—to forego marginal
training benefits in favor of reduced overhead.
* Timeliness and Data Freshness: In certain applications, the Age of
Information (AoI) remains critical. Prioritizing model dispatch
to data sources with soon-to-expire or time-sensitive information
is essential to maximize the utility of training and to maintain
up-to-date model performance.
2.2.2. Data and Resource Discovery
Given the distributed nature of data, there must be a mechanism
through which data owners can advertise information about their
datasets to AI model training clients. This requires the ability to
describe the characteristics of the data—such as its knowledge
content, quality, size, and Age of Information (AoI)—in a way that
allows AI clients to discover and evaluate whether the data aligns
with their training objectives. Training objectives can be one or
more of: target performance, convergence time, training cost, etc.
Crucially, this discovery process may need to operate across multiple
network domains and heterogeneous communication infrastructures. For
example, an AI training client operating over a wireline connection
may be interested in data residing on a 3GPP mobile network. This
raises an important question: How can data owners effectively
advertise their datasets in a way that is discoverable across diverse
domains? To enable such cross-domain data visibility and discovery,
the following key requirements must be considered:
* Data Descriptors: These are metadata objects used by data owners
to reveal essential information about their datasets to AI
clients. Effective data descriptors must be self-contained,
privacy-preserving, and informative enough to support decision-
making by training clients. They should allow data owners to
selectively disclose details about their data—such as type,
relevance, quality metrics, freshness, and perhaps cost of
utility—while concealing sensitive or proprietary information
(privacy preservation). Data descriptors also need to be easily
modified as data can be dynamic, and the change in data needs to
be effectively reflected into the data descriptions. To ensure
interoperability, data descriptors can either follow a
standardized format or adopt a flexible but well-defined structure
that enables consistent interpretation across different systems
and domains.
* Data Discovery Mechanisms: These refer to the processes by which
AI training clients locate and identify datasets across
potentially vast and heterogeneous environments. An effective
discovery mechanism should support global-scale searchability and
cross-domain operability, allowing clients to find relevant
datasets regardless of where they reside or which communication
infrastructure they are accessible through. Discovery protocols
may be standardized within specific domains (e.g., mobile
networks, IoT platforms) or designed to function interoperably
across multiple domains, enabling seamless integration and
visibility. It should also be highlighted that discovery
mechanisms must stay up to date as the underlying data changes
dynamically.
* Data Relationship Maps: Training often requires identifying groups
of datasets that collectively meet specific requirements.
Evaluating each dataset in isolation may be insufficient.
Instead, a mechanism is needed to establish relationships among
datasets, enabling AI training clients to assemble the appropriate
combination of data for their tasks. These relationships can be
envisioned to look like maps or topologies. This is a crucial
step: if an AI model client cannot find the right datasets to
satisfy its requirements, it may choose not to submit the model
for training at that time, avoiding resource wastage from the
outset.
* Timely reporting: Given the dynamic nature of data availability,
characteristics, and accessibility, it is essential to have
advertisement mechanisms that can promptly reflect any changes.
Real-time or near-real-time updates ensure that the AI training
process remains aligned with the most current data conditions,
thereby maximizing both effectiveness and accuracy. Timely
reporting helps prevent training on outdated or irrelevant data
and supports optimal decision-making in model selection and
training pipeline configuration.
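As a rough illustration of the descriptors and discovery filters
discussed above (the field set is an assumption for this sketch, not
a standardized DA-ITN schema), a data descriptor and a client-side
matching check might look as follows in Python:

   from dataclasses import dataclass

   @dataclass
   class DataDescriptor:
       dataset_id: str
       modality: str              # e.g. "text", "image", "audio"
       knowledge_tags: list       # coarse summary of knowledge content
       quality_score: float       # owner-reported quality in [0, 1]
       size_gb: float
       age_of_info_s: int         # seconds since last update (AoI)
       access_domain: str         # e.g. "3gpp-mobile", "wireline"
       cost_per_use: float = 0.0

   def matches(d, tags, max_aoi_s, min_quality):
       # Client-side filter: does the descriptor meet the training
       # objectives, without the client ever seeing the raw data?
       return bool(set(tags) & set(d.knowledge_tags)
                   and d.age_of_info_s <= max_aoi_s
                   and d.quality_score >= min_quality)

   d = DataDescriptor("ds-42", "image", ["traffic", "urban"], 0.9,
                      120.0, 3600, "3gpp-mobile")
   print(matches(d, ["traffic"], max_aoi_s=7200, min_quality=0.8))
   # True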
Additionally, it should be highlighted that in AI training,
discovering data alone is not enough. For instance, third-party
resources like compute and storage are essential, and the providers
of those resources must be able to advertise their capabilities so AI
clients can locate and utilize them effectively. Just like with
data, resource discovery requires descriptors, multi-domain
accessibility, and timely updates to support seamless coordination
between models, data, and infrastructure. It should be highlighted
that data and resource discovery is essential in both centralized and
decentralized training, as both can be done on third-party
infrastructure.
2.2.3. Mobility and Service Continuity Handling
In some decentralized training applications, AI models are designed
to traverse a predefined route, training on multiple datasets in a
sequential or federated manner. This introduces the need to manage
model mobility. However, the underlying data landscape is often
dynamic—new data is continuously generated, existing data may be
deleted, or datasets may be relocated to different nodes or domains.
As a result, enabling reliable model mobility in such a fluid
environment requires robust mobility management mechanisms. For
instance, while a model is en-route to a specific data location for
training, that dataset may be moved elsewhere. In such cases, the
model must either be re-routed to the new location or redirected to
an alternative dataset that satisfies similar training objectives.
Additionally, since training occurs on remote compute infrastructure
and can be time-intensive, unexpected resource shutdowns or failures
may interrupt the process. These interruptions can lead to service
discontinuity, which must be addressed through mechanisms such as
checkpointing, fallback resource selection, or dynamic rerouting of
model or data to maintain training progress and system reliability.
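A minimal sketch of the checkpointing fallback mentioned above could
look like the following (purely illustrative; a real system would
persist checkpoints to durable storage and learn of failures from
the management plane rather than from an exception):

   import pickle, random

   def train_one_epoch(state):
       # Stand-in for one epoch on a remote compute resource.
       if random.random() < 0.2:            # simulated resource failure
           raise RuntimeError("compute node lost")
       return {"epoch": state["epoch"] + 1}

   def run(total_epochs, ckpt_path="model.ckpt"):
       try:
           with open(ckpt_path, "rb") as f:
               state = pickle.load(f)       # resume from last checkpoint
       except FileNotFoundError:
           state = {"epoch": 0}
       while state["epoch"] < total_epochs:
           try:
               state = train_one_epoch(state)
               with open(ckpt_path, "wb") as f:
                   pickle.dump(state, f)    # checkpoint each epoch
           except RuntimeError:
               continue                     # retry from last checkpoint
       return state

   print(run(5)["epoch"])                   # 5, despite injected failures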
Additionally, model mobility may involve training on datasets that
are distributed across heterogeneous communication infrastructures.
Some infrastructures, such as emerging 6G networks, offer built-in
mobility support—for example, when data resides on mobile user
equipment (UE), its location can be tracked using native features of
the network. However, such mobility handling capabilities may not
exist in other infrastructures, such as traditional wireline networks
or legacy systems, making seamless model movement and data access
more challenging in those environments.
2.2.4. Privacy, Trust, and Data Ownership and Utility
Privacy and trust are mutual responsibilities—both data owners and
model owners must be protected. Granting clients access to data for
training and knowledge building should be a regulated process, with
mechanisms to track data ownership and future use. Initial
discussions on this topic have taken place in forums such as the AI-
Control Working Group.
Equally important is ensuring that model owners are protected from
data poisoning. They must have confidence that the datasets they use
are accurately described and not misrepresented. If data owners
provide false metadata—intentionally or otherwise—model owners may
unknowingly train on unsuitable or harmful datasets, leading to
degraded model performance. To safeguard both parties, innovative
verification and enforcement mechanisms are needed. Technologies
like blockchain could offer potential solutions for establishing
trust and accountability, but further research and exploration are
necessary to develop practical frameworks.
2.2.5. Testing and Performance Management
Another critical aspect of training is testing and performance
evaluation, typically carried out using a separate subset of the data
known as the testing dataset. This dataset is not used to update the
model’s weights but to assess its performance on unseen samples. In
centralized training, this process is straightforward because all
data resides in a single, accessible location, making it easy to
partition the dataset into training and testing subsets. However, in
distributed training environments, where data is spread across
multiple locations or devices, creating a representative and unbiased
testing dataset without aggregating the data centrally becomes a
major challenge. Developing effective, privacy-preserving methods
for testing in such settings requires innovative solutions.
2.2.6. QoS Guarantee
Beyond ensuring traditional Quality of Service (QoS) for data
transmission, a new dimension of QoS must be considered—the QoS of
training itself. In AI training workflows, it is crucial to
guarantee that key performance indicators (KPIs) related to training,
such as accuracy convergence, training time, and resource
utilization, are met consistently. This raises several important
questions:

* How can these training KPIs be guaranteed in dynamic or
distributed environments?
* What mechanisms can be used to monitor and track training
performance in real time?
* Should AI training be treated like best-effort traffic, where no
guarantees are made and resources are allocated as available?
* Or should training tasks receive prioritized or differentiated
service levels, similar to high-priority traffic in traditional
networks?
Addressing these questions is essential to ensure predictable and
reliable AI model development, especially as training workloads grow
in complexity and scale. It may require introducing new QoS
frameworks tailored specifically to the needs of AI training systems.
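One conceivable building block of such a framework, sketched below
with assumed KPI names and thresholds that are not part of any
defined DA-ITN interface, is a monitor that compares reported
training KPIs against agreed targets and flags violations in near
real time:

   def check_training_qos(report, targets):
       # Return the list of violated KPIs for one progress report.
       violations = []
       if report["accuracy"] < targets["min_accuracy"]:
           violations.append("accuracy")
       if report["elapsed_s"] > targets["max_time_s"]:
           violations.append("training time")
       if report["gpu_util"] < targets["min_gpu_util"]:
           violations.append("resource utilization")
       return violations

   report = {"accuracy": 0.87, "elapsed_s": 5400, "gpu_util": 0.55}
   targets = {"min_accuracy": 0.90, "max_time_s": 7200,
              "min_gpu_util": 0.6}
   print(check_training_qos(report, targets))
   # ['accuracy', 'resource utilization']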
2.2.7. Charging and Billing
The AI training process involves a diverse ecosystem of stakeholders,
including data owners, model owners, and resource providers. Each of
these parties plays one or more vital roles in enabling successful
training workflows.
For example, communication providers contribute not only by
transporting data and models across the network but may also serve
as data providers themselves. This is particularly
evident in the emerging design of 6G networks, which integrate
sensing capabilities with communication infrastructure. As a result,
6G operators are uniquely positioned to offer both connectivity and
data, making them central players in the training pipeline.
Despite their different roles, all parties contribute to enabling AI
training as a service, a complex and resource-intensive process that
is far from free. Therefore, it is essential to establish a robust
charging and billing framework that ensures each participant is
fairly compensated based on their contribution.
Several open questions arise in this context:
* Should training services follow a prepaid model, or adopt a pay-
per-use structure?
* Will there be tiered service offerings, such as gold, silver, and
platinum, each providing different levels of performance
guarantees or priority access?
* How should these tiers be defined and enforced in terms of service
quality, resource allocation, and response time?
Developing fair, transparent, and scalable billing mechanisms is
critical to facilitating collaboration across stakeholders and
sustaining the economic viability of distributed AI training
ecosystems. These challenges call for further research into
incentive structures, dynamic pricing models, and smart contract-
based enforcement, especially in scenarios involving cross-
organizational or cross-network cooperation.
3. Inference
Inference is critical because it represents the phase where the model
begins to deliver practical value. Unlike training, which is
typically a one-time or periodic, resource-intensive process,
inference often needs to operate continuously and efficiently,
sometimes in real-time. Although inference is a less resource-
intensive process, it has strict requirements that govern its
success. In what follows, we explore the requirements that enable
a successful AI inference ecosystem.
3.1. Requirement Breakdown
We envision an inference ecosystem composed of a large number of pre-
trained AI models (or agents) distributed across the globe. These
models are capable of performing a wide range of tasks, such as image
classification, language translation, or speech recognition. Some
models may specialize in the same task but vary in performance,
accuracy, latency, or resource demands. This diverse pool of models
is accessed by numerous inference clients (users or applications) who
submit inputs, referred to as queries, and receive task-specific
outputs.
These queries can vary greatly in complexity, structure, and
modality, with some requiring the cooperation of multiple models to
fulfill a single request. The overarching goal of the ecosystem is
to efficiently match incoming queries with the most suitable models,
ensuring accurate, timely, and resource-aware responses. Achieving
this requires intelligent orchestration, load balancing, and
potentially dynamic model selection based on factors such as
performance, availability, cost, and user-specific requirements. In
what follows, we discuss the various aspects of this ecosystem and
discuss the different requirements needed for its success.
3.1.1. Model Deployment and Mobility
The first step toward building a successful AI inference ecosystem is
the optimal deployment of trained models, or agents. In this
context, optimality refers to both the physical or network location
of the model and the manner in which it is deployed. AI models vary
significantly in size and resource requirements—ranging from
lightweight models that are only a few kilobytes to large-scale
models with billions of parameters. This wide range makes deployment
decisions critical to achieving both efficient performance and
effective resource utilization. Also, a unique factor to AI models/
agents is the fact that they are software components that are not
bounded to a certain hardware. They can be deleted, copied, moved,
or split across multiple compute locations. All these unique aspects
provide flexibility in design if the real-time status of the
underlying network dynamics and resources is made accessible.
* Choosing the right facility to host a model: whether it's a
lightweight edge device, a local server, or a high-performance
cloud data center, deployment will depend on the model's size,
computational requirements, and expected query volume. For
example, smaller models might be best suited for deployment on
edge devices closer to users, enabling low-latency responses. In
contrast, larger models may require centralized or specialized
infrastructure with high compute and memory capacity.
* Load balancing: Once models are deployed, inference traffic begins
to flow, with users or applications sending queries to the
appropriate agents. If not managed properly, this traffic can
lead to congestion, creating bottlenecks that degrade inference
performance through increased latency or dropped requests. To
avoid such scenarios, models should be deployed strategically to
distribute the load, ensuring smooth operation. Traditional load
balancing techniques can be employed to redirect traffic away from
overburdened nodes and towards underutilized ones. However, more
sophisticated strategies may involve replicating models and
placing these replicas closer to regions with high query demand,
thereby minimizing latency and easing network traffic engineering
challenges.
* Mobility-aware deployment: the dynamic nature of inference traffic
necessitates mobility-aware deployment. For instance, consider a
large data center acting as a centralized inference hub, hosting
numerous models and handling a significant volume of queries.
Over time, this hub may experience traffic overload. In such
cases, migrating certain models to alternative locations can help
alleviate pressure. However, model migration is not without its
challenges—particularly if a model is actively serving queries at
the time of migration. In such situations, mobility handling
mechanisms must be in place to ensure seamless service continuity.
These mechanisms could involve session handovers, temporary state
preservation, or model version synchronization, all designed to
maintain uninterrupted service during the migration process.
In summary, optimal model deployment requires careful consideration
of model size, resource needs, query distribution, and real-time
adaptability. Achieving this lays the foundation for a responsive,
scalable, and resilient AI inference ecosystem.
3.1.2. Model Discovery and Description
Just as data descriptors and discovery mechanisms are essential
during the training phase, AI model inference clients also require a
robust discovery mechanism during the inference stage. In an
ecosystem populated by a large and diverse pool of models—each with
unique capabilities and specializations—clients are presented with
significant flexibility and choice in selecting the most suitable
models for their queries. However, to make informed decisions,
clients must have access to information that enables them to
distinguish between models based on criteria such as performance,
specialization, availability, and resource requirements.
This discovery process becomes even more complex when it needs to
function across multiple network domains and heterogeneous
communication infrastructures. For instance, a client connected via
a wireline network might need to interact with a model deployed on a
mobile 3GPP network. Such scenarios raise a critical question: How
can model owners advertise their models in a way that ensures
discoverability and interoperability across diverse domains?
Addressing this challenge requires the development of standardized
model advertisement and discovery protocols that can operate
seamlessly across infrastructure boundaries. These protocols must
accommodate differences in network technology, latency constraints,
and security requirements while providing consistent and reliable
access to model information. Ensuring cross-domain discoverability
is crucial to unlocking the full potential of a globally distributed
inference ecosystem.
To enable such cross-domain model visibility and discovery, the
following key requirements must be considered:
* Model Descriptors: These are metadata objects used by model owners
to reveal essential aspects about their models to AI inference
clients. Effective model descriptors must be self-contained,
privacy-preserving, and informative enough to support decision-
making by inference clients. They should allow model owners to
selectively disclose details about their model—such as skills,
performance reviews, trust level, relevance, quality metrics,
freshness, and perhaps cost of utility—while concealing sensitive
or proprietary information. To ensure interoperability, model
descriptors can either follow a standardized format or adopt a
flexible but well-defined structure that enables consistent
interpretation across different systems and domains.
* Model/agent Discovery Mechanisms: These refer to the processes by
which AI inference clients locate and identify models/agents
across potentially vast and heterogeneous environments. An
effective discovery mechanism should support global-scale
searchability and cross-domain operability, allowing clients to
find relevant model/agents regardless of where they reside or
which communication infrastructure they are accessible through.
Discovery protocols may be standardized within specific domains
(e.g., mobile networks, IoT platforms) or designed to function
interoperably across multiple domains, enabling seamless
integration and visibility.
* Model/agent relationship maps: As queries may require the
collaboration of multiple models/agents, relationships between
models/agents with respect to different tasks can serve as useful
tools to help clients choose the appropriate subset of models/
agents to handle their queries.
* Timely Reporting: Similar to data, the status of a model can
change over time—for example, due to shifts in workload or
resource availability. It is important that such changes are
reported promptly and accurately, allowing clients to make
informed decisions based on the model’s current state. This is
essential for ensuring efficient model selection and maintaining
high-quality, reliable inference outcomes.
It is important to emphasize that model discovery differs
fundamentally from data discovery. While data are passive objects
that require external querying or manipulation, models are
intelligent, autonomous agents capable of making decisions based on
their own capabilities, status, and context. This distinction opens
up new and more dynamic possibilities for how models are discovered
and engaged in an inference ecosystem.
In traditional data discovery, clients search for and retrieve
relevant datasets based on metadata or predefined criteria. However,
in the case of model discovery, the process can be much more
interactive and flexible. One approach involves the client actively
discovering models by querying a directory or registry using model
descriptors. Based on these descriptors, the client selects one or
more models to handle a specific inference task. However, given that
models can reason and act independently, model discovery does not
have to be limited to client-driven selection. An alternative
approach is to reverse the flow of interaction. Instead of clients
seeking out models, they can publish their tasks to a shared task
pool, accessible to all available models. These tasks include
descriptors that define the type of work to be done, expected
outputs, and quality-of-service requirements. Models can then
autonomously scan this pool, evaluate whether they are well-suited
for specific tasks, and choose to express interest in executing them.
This self-selection process allows models to play an active role in
task matching, improving system scalability and efficiency.
The final assignment of a task can be handled in different ways.
Clients may retain full control and approve or reject interested
models based on their preferences or priorities. Alternatively, the
system may operate in a fully autonomous mode, where tasks are
assigned automatically to the first or best-matching model, without
requiring client intervention—depending on the client's chosen
policy.
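The task-pool interaction described above might look like the
following sketch, in which all class and field names are
hypothetical and the matching logic is deliberately simple:

   class TaskPool:
       # Shared pool where clients publish task descriptors.
       def __init__(self):
           self.tasks = []

       def publish(self, task):
           self.tasks.append(task)

       def scan(self):
           return list(self.tasks)

   class ModelAgent:
       def __init__(self, name, skills, latency_ms):
           self.name = name
           self.skills = skills            # tasks this agent performs
           self.latency_ms = latency_ms    # typical response time

       def interested_in(self, task):
           # Self-selection: can this agent meet the requirements?
           return (task["skill"] in self.skills
                   and self.latency_ms <= task["max_latency_ms"])

   pool = TaskPool()
   pool.publish({"id": "t1", "skill": "translate",
                 "max_latency_ms": 200})

   agents = [ModelAgent("m1", {"translate"}, 150),
             ModelAgent("m2", {"classify"}, 50)]

   # Fully autonomous policy: assign each task to the first
   # interested agent.
   for task in pool.scan():
       winner = next((a for a in agents if a.interested_in(task)),
                     None)
       print(task["id"], "->",
             winner.name if winner else "unassigned")   # t1 -> m1

Under a client-approval policy, the client would instead inspect the
full set of interested agents before assignment.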
This agent-driven paradigm reflects the shift toward more
decentralized and intelligent AI ecosystems, where models are not
merely passive computation endpoints but active participants in task
negotiation and resource allocation. Such a system not only enhances
scalability and flexibility but also allows for more efficient
utilization of the available model pool, especially in heterogeneous
and dynamic environments.
3.1.3. Query and Inference Result Routing
A significant challenge in AI inference networks lies in efficiently
routing client queries to the appropriate inference models and
ensuring the corresponding results are reliably delivered back to the
client. This becomes particularly complex in scenarios involving
mobility and multi-domain environments, where both the client and the
model may exist across different types of network infrastructures.
The key challenges and considerations include:
* Query Routing Across Heterogeneous Networks: When a client
accesses the inference ecosystem through a mobile network such as
3GPP 6G, and the target model is hosted in a wireline or cloud-
based infrastructure, routing the query across these distinct
domains is non-trivial. Differences in network architecture,
protocols, and service guarantees complicate the end-to-end flow.
* Mobility Management During Inference Execution: While mobile
networks like 6G are designed to handle user mobility, inference
tasks may take time to process—particularly when using large
models or performing complex computations. During this time, the
client may change physical location, switch devices, or even go
offline. Ensuring that inference results can still reach the
client under these dynamic conditions poses a significant
challenge.
* Handling Client State Changes: If a client becomes idle or
disconnects entirely during inference, the system must decide what
to do with the completed result. Should it be queued, buffered,
forwarded to another linked device, or simply discarded? A robust
mechanism is needed to track client state, maintain context, and
guarantee result delivery or at least graceful degradation (see
the sketch after this list).
* Support for Live and Streaming Inference: Some use cases, such as
real-time audio transcription, involve live streaming of data from
the client to the model and vice versa. These sessions require
sustained, low-latency connections and are particularly sensitive
to interruptions caused by mobility or handoffs between networks.
Ensuring session continuity and maintaining streaming quality
across network boundaries is a complex but critical aspect of
real-world inference deployments.
* Cross-Domain Connectivity and Session Management: The involvement
of multiple network operators and domains introduces questions
around interoperability, session tracking, and handover
coordination. There is a need for intelligent infrastructure
capable of end-to-end session management, including maintaining
metadata, context, and service quality as the session traverses
different networks.
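One way to picture the client-state handling raised above is the
following sketch; the states and delivery actions are assumptions
for illustration only:

   buffered = {}                            # client_id -> queued results

   def deliver(client_id, state, result, linked_device=None):
       if state == "connected":
           return ("send", client_id, result)
       if state == "idle":
           buffered.setdefault(client_id, []).append(result)
           return ("buffered", client_id, None)
       if state == "offline" and linked_device:
           return ("forward", linked_device, result)
       return ("dropped", client_id, None)  # graceful degradation

   print(deliver("c1", "idle", {"answer": 42}))
   # ('buffered', 'c1', None)
   print(deliver("c2", "offline", {"answer": 7},
                 linked_device="c2-tablet"))
   # ('forward', 'c2-tablet', {'answer': 7})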
3.1.4. Inference Chaining/Collaborative Inference
Another critical aspect of an AI inference ecosystem is the need for
model collaboration to fulfill complex or multi-faceted tasks. Not
all inference requests can be handled by a single model; in many
cases, collaboration between multiple models is necessary.
Effectively managing this task-based collaboration is essential to
ensure accurate, efficient, and scalable inference services. Model
collaboration can take several distinct forms:
* Inference Chaining: In this model, the output of one model serves
as the input to the next in a sequential pipeline. Each model
performs a specific stage of the task, and the final
result—produced by the last model in the chain—is returned to the
client. This is common in multi-stage tasks such as image
processing followed by object detection and then classification.
* Parallel Inference: Here, a complex task is decomposed into
multiple subtasks, each of which is assigned to a specialized
model. These models operate concurrently, and their outputs are
aggregated to form a unified inference result. This approach is
particularly useful when dealing with large data sets or when a
task spans different domains of expertise.
* Hierarchical inference: A model is assigned as a task manager and
is responsible for delegating tasks to service models.
* Collaborative Inference: In this more dynamic and decentralized
form, the task is assigned to a group of models that are capable
of discovering one another, assessing their respective
capabilities, and coordinating among themselves to devise a shared
strategy for completing the task. This model requires more
sophisticated communication, negotiation, and orchestration
mechanisms.
Regardless of the collaboration format, the success of such multi-
model interactions depends on the availability of a robust management
infrastructure. This infrastructure must enable seamless
coordination between models, even when:
* The models are hosted by different providers,
* They are deployed across heterogeneous communication networks,
* They use varying protocols, or
* They have differing performance characteristics.
Such a management system must abstract away the underlying
complexities and provide standardized interfaces, discovery
mechanisms, communication protocols, and coordination frameworks that
allow models to interact effectively. Without this, collaborative
inference would be brittle, inefficient, or impossible to scale. In
essence, the ability to orchestrate model collaboration across
diverse environments is a cornerstone of a flexible, intelligent, and
robust AI inference ecosystem.
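A minimal sketch of the first form, inference chaining, follows; the
stage functions are stand-ins for calls to remote models, and the
other collaboration forms differ mainly in how the intermediate
outputs are combined:

   # Illustrative inference chain: each stage's output feeds the
   # next stage, and the last stage's output goes to the client.

   def preprocess_image(query):
       return {"pixels": query["raw"], "normalized": True}

   def detect_objects(img):
       return {"boxes": [(10, 10, 50, 50)], "source": img}

   def classify(detection):
       return ["cat" for _ in detection["boxes"]]

   def run_chain(query, stages):
       # Route the query through a sequential pipeline of models.
       result = query
       for stage in stages:
           result = stage(result)
       return result

   print(run_chain({"raw": b"..."},
                   [preprocess_image, detect_objects, classify]))
   # ['cat']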
3.1.5. Compute and Resource Management
In many scenarios, the compute infrastructure used to host and run
inference models is managed by third-party providers, not the model
owners themselves. These compute providers are responsible for
meeting the Quality of Service (QoS) levels agreed upon with the
model owners—such as latency, uptime, throughput, and reliability.
* Ensuring these service levels are consistently met raises the
question of accountability. If performance degrades due to
compute resource issues—such as overloaded hardware or network
outages—who is responsible for the failed inference tasks?
* There must be clear, enforceable service-level agreements (SLAs)
that define roles, responsibilities, and penalties for non-
compliance.
* Mechanisms for performance monitoring, auditing, and dispute
resolution need to be integrated into the ecosystem to make such
arrangements viable and trustworthy.
3.1.6. Privacy Preservation and Security
While models are the intellectual property of their owners, they may
operate on infrastructure owned by others. This raises significant
concerns around privacy and intellectual property protection.
* Sensitive model details such as architecture, weights, and
optimization strategies must be protected from exposure or reverse
engineering by untrusted compute hosts.
* Techniques such as secure computing, encrypted model execution,
and remote attestation protocols may be necessary to ensure that
models run securely without revealing proprietary details.
* Model owners must also be assured that inference inputs and
outputs remain confidential, particularly in applications
involving personal or sensitive data.
3.1.7. Utility Handling and QoS Requirements
Utility handling refers to the regulation, protection, and fair
governance of how models are used, accessed, and monitored throughout
the ecosystem. This encompasses several critical questions:
* How can we guarantee that a model deployed on remote
infrastructure is not being tampered with, copied, or
intentionally repurposed?
* How do we ensure that workload distribution is fair across
available models, preventing monopolization by a few and giving
equal visibility and opportunity to all participating models?
* What protections are in place to ensure that models are not being
poisoned, exploited, or involved in illegal activities, either
through malicious inputs or untrusted outputs?
* How do we ensure the integrity of inference results, so that
outputs are delivered to clients without alteration, manipulation,
or censorship?

Addressing these concerns may require digital rights management
(DRM) for AI models, usage monitoring tools, and potentially
blockchain-based logging or audit trails to ensure transparency and
traceability.
On the other hand, the definition of Quality of Service (QoS), when
it comes to inference tasks, is very broad and can take many forms.
For instance, QoS could be to guarantee a certain accuracy of a
response, or time of the response, or expertise level needed. We
believe that the topic of QoS guarantee requires extensive studying
and analysis.
3.1.8. Model Upgrade Streamlining
AI models are not static; they undergo continuous upgrades,
improvements, and fine-tuning to maintain accuracy, adapt to new
data, or support evolving tasks.
* The ecosystem must support seamless model versioning, including
adding, removing, or modifying model agents without disrupting
ongoing services.
* Updated model profiles must be instantly reflected in the
discovery layer, ensuring clients always have access to the most
current and accurate model descriptions.
* For large models, upgrade procedures must be efficient and
bandwidth-conscious, potentially using incremental update
techniques to avoid full redeployment.
* Moreover, strategies must be in place to handle hot-swapping of
models, where an old model is gracefully decommissioned and
replaced by a new one—without causing inference failures or data
loss during the transition.
3.1.9. Charging and Billing
The AI inference process involves a diverse ecosystem of
stakeholders, including model owners, compute providers, and
communication providers. Each of these parties plays one or more
vital roles in enabling successful inference workflows. Therefore,
it is essential to establish a robust charging and billing framework
that ensures each participant is fairly compensated based on their
contribution.
Several open questions arise in this context:
* Should inference services follow a prepaid model, or adopt a pay-
per-use structure?
* Will there be tiered service offerings—such as gold, silver, and
platinum—each providing different levels of performance guarantees
or priority access?
* How should these tiers be defined and enforced in terms of service
quality, resource allocation, and response time?
* What about discovery framework providers? Would they be offering
a free service like Google Search, or would it be more structured?
Developing fair, transparent, and scalable billing mechanisms is
critical to fostering collaboration across stakeholders and
sustaining the economic viability of distributed AI inference
ecosystems. These challenges call for further research into
incentive structures, dynamic pricing models, and smart contract-
based enforcement, especially in scenarios involving cross-
organizational or cross-network cooperation.
4. Data Aware Inference and Training Network (DA-ITN): General
Framework
The DA-ITN is envisioned as a multi-domain, multi-technology network
operating at the AI layer, designed to address the various layers of
complexity inherent in modern AI ecosystems. It aims to support a
wide range of requirements across AI training, inference, and agent-
to-agent interaction, as previously outlined. To manage these
complexities and cater for the requirements, we propose structuring
the DA-ITN around four core components: a Control Plane (CP), a Data
Plane (DP), an Operations and Management (OAM) Plane, and an
Intelligence Layer. It is important to note that the DA-ITN is
agnostic to the underlying communication infrastructure, allowing it
to operate seamlessly over heterogeneous networks, whether mobile,
wireline, or satellite-based. The DA-ITN integrates with these
underlying infrastructures through any available means, embedding its
control and intelligence capabilities to coordinate and manage AI-
specific services in a flexible and scalable manner.
4.1. Control Plane and Intelligence Layer
The Control Plane and Intelligence Layer work together to enable an
efficient, reliable, and timely information collection
infrastructure. They continuously gather up-to-date information on
data availability, model status, agent conditions, resource
utilization, and reachability across all participating entities. The
collected information comes in the form of dynamic descriptors for
data, models, and resources, essential components for enabling
intelligent, context-aware decision-making within the AI ecosystem as
has previously been highlighted. Also, with the help of the data,
resource, and reachability topology engine (DRRT) housed within the
intelligence layer, the gathered information and descriptors can be
used to construct meaningful relationships across the ecosystem.
These are captured in the form of dynamic topologies or map-like
structures, which help optimize decision-making processes across
training, inference, and agent-to-agent collaboration tasks. This
design provides a continuous awareness that is essential for the
success, reliability, accuracy, and responsiveness of the AI
functionalities and services enabled by the DA-ITN within the AI
ecosystem.
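As an illustration of the DRRT's map-like structures described
above, the following sketch (illustrative only, not a defined DA-ITN
data model) links datasets, compute sites, and access domains into a
reachability topology that decision logic can then query:

   from collections import defaultdict

   topology = defaultdict(set)    # node -> set of adjacent nodes

   def link(a, b):
       topology[a].add(b)
       topology[b].add(a)

   link("dataset:ds-42", "domain:3gpp-mobile")
   link("compute:edge-7", "domain:3gpp-mobile")
   link("compute:dc-1", "domain:wireline")
   link("domain:3gpp-mobile", "domain:wireline")  # inter-domain gateway

   def reachable(src, dst):
       # Breadth-first search over the topology map.
       seen, frontier = {src}, [src]
       while frontier:
           node = frontier.pop(0)
           if node == dst:
               return True
           for nxt in topology[node] - seen:
               seen.add(nxt)
               frontier.append(nxt)
       return False

   print(reachable("compute:dc-1", "dataset:ds-42"))
   # True, via the inter-domain gateway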
The DA-ITN control plane also lays a foundation for an advanced
discovery infrastructure where the generated descriptors can be made
easily accessible to all authorized participants to facilitate their
required AI service. For example, AI clients subscribed to training
services can access up-to-date data descriptors and resource
topologies, enabling them to select appropriate datasets and compute
resources that align with their performance and accuracy goals.
Similarly, inference clients or agents seeking collaboration can
discover models based on capabilities, or submit task descriptors
that enable models to respond intelligently and autonomously.
Aside from descriptor collection, topology creation, and discovery,
the DA-ITN control plane also supports a secure and trusted
environment where clients, data providers, model providers, and
resource providers can engage in AI processes without compromising
integrity or accountability. It also plays a key role in managing
charging, billing, and rights enforcement, ensuring that all
contributors to the AI service chain are fairly compensated and
protected.
It is worth noting that the DA-ITN’s Control Plane is not constrained
by specific protocol stacks. Instead, it provides a flexible
connectivity and coordination infrastructure upon which various AI-
related protocols—such as Agent-to-Agent (A2A), Model Control
Protocol (MCP), or AI Coordination Protocol (ACP)—can operate.
Regardless of the protocol used, implementations must meet the core
DA-ITN requirements, including timely information exchange, flexible
descriptor encapsulation, support for multi-model and multi-domain
environments, and robust security and privacy protections. The DA-
ITN is also designed to support both centralized and decentralized
modes of operation, offering high adaptability across different
deployment contexts.
4.2. Data Plane
On the other hand, the Data Plane of the DA-ITN provides support for
mobility management and intelligent scheduling, enabling the dynamic
creation of rendezvous points where data, queries, models, and
compute infrastructure can be brought together with minimal latency
and overhead. Thanks to its infrastructure-agnostic nature, the DA-
ITN leverages existing communication networks—such as those offered
by 6G or edge service providers—as tools to enable model mobility,
data mobility, and agent-to-agent coordination. This capability is
essential for supporting scenarios where mobility or geographical
dispersion of resources would otherwise lead to performance
degradation or inefficiency.
4.3. Operation and Management Plane (OAM)
Finally, the Operations and Management (OAM) layer plays a critical
role in supporting the day-to-day operational needs of the AI
ecosystem. This layer is responsible for a wide range of essential
functions, including monitoring, registration, configuration, fault
management, and lifecycle maintenance of models, data, and services.
It serves as the management backbone of the DA-ITN, ensuring
transparency, accountability, and operational control throughout the
system.
Consider the scenario of an AI model training client deploying a
model into the ecosystem for training. Through the capabilities of
the OAM layer, the client can continuously monitor the training
performance of their model in real time—tracking key performance
indicators such as convergence speed, loss metrics, resource usage,
and network traversal. The model’s location within the ecosystem can
be dynamically tracked, allowing clients to know exactly where their
model resides or which data centers or devices it is interacting
with.
Akhavain & Moussa Expires 23 January 2026 [Page 23]
Internet-Draft AI-Internet July 2025
Moreover, the OAM layer enables interactive control. Clients can use
it to adjust training parameters on the fly, such as learning rates,
data sampling strategies, or the choice of collaborative partners.
They can even pause, resume, or terminate the training process at
will, giving them full agency over the lifecycle of their models.
This flexibility is crucial in adaptive AI systems where
responsiveness and real-time decision-making are valued.
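To make the described monitoring and interactive control concrete,
the sketch below models a training session handle of the kind a
client might hold. The class, its method names, and the KPI fields
are illustrative assumptions, not a defined DA-ITN API.

   # Hypothetical sketch of an OAM-style training session handle.
   # Names and KPI values are placeholders, not a defined API.
   class TrainingSession:
       def __init__(self, model_id):
           self.model_id = model_id
           self.state = "running"
           self.learning_rate = 1e-3

       def kpis(self):
           """Monitoring view; values here are placeholders."""
           return {"loss": 0.42, "convergence_rate": 0.9,
                   "gpu_util": 0.75, "location": "edge-dc-3"}

       def set_learning_rate(self, lr):
           self.learning_rate = lr     # on-the-fly adjustment

       def pause(self):
           self.state = "paused"

       def resume(self):
           self.state = "running"

   s = TrainingSession("model-42")
   print(s.kpis()["location"])  # where the model currently resides
   s.set_learning_rate(5e-4)
   s.pause()
   s.resume()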
In this way, the OAM layer effectively functions as the control
dashboard or command-line terminal of the DA-ITN-enabled AI
ecosystem. Whether through a graphical user interface (GUI), APIs,
or automated orchestration scripts, the OAM provides the necessary
tools for fine-grained management, status visualization, and policy
enforcement.
Beyond individual model control, the OAM layer also facilitates
system-wide coordination and policy administration—ensuring
compliance with service-level agreements (SLAs), enforcing data
governance policies, and managing access rights across domains. It
plays a foundational role in building trustworthy, maintainable, and
operationally efficient AI services across diverse infrastructure
providers and stakeholders.
4.4. Summary of the DA-ITN General Framework
Accordingly, the DA-ITN is well positioned to provide a
range of intelligent services that can be leveraged by both AI
clients and service providers. It forms the foundation for a
scalable, decentralized AI internet, driving the emergence of a
vibrant and cooperative agent-based ecosystem. By enabling the
formation of adaptive and intelligence-driven topologies and being
agnostic to the infrastructure, the DA-ITN facilitates more effective
decisions in AI training, inference, and agent-to-agent
interactions—ultimately supporting a more responsive, resilient, and
capable AI infrastructure that can scale with future demands.
In the following sections, we provide more detailed insights into the
specific DA-ITN components that support training and inference
services.
5. DA-ITN for Training
The training architecture of the DA-ITN consists of five layers: i)
the terminal layer; ii) the network layer; iii) the data, resource,
and reachability topology (DRRT) layer; iv) the DA-ITN intelligence
layer; and v) the OAM layer. The layers interact using the control
and data planes (CP and DP, respectively), as discussed below.
Akhavain & Moussa Expires 23 January 2026 [Page 24]
Internet-Draft AI-Internet July 2025
First, the network layer, which is at the heart of the DA-ITN
training system, is responsible for providing connectivity services
to the four other layers. It provides both control and data plane
connectivity to enable various services. The network layer connects
to the terminal and DRRT layers via CP and DP links, and connects to
the intelligence layer via a CP link only. The network layer also
underpins the overarching OAM layer by providing a multi-layer
connectivity structure.
Second, the terminal layer, the lowest layer in the architecture,
contains the terminal components of the system. These include nodes
that host the training data, facilities that provide computing
resources where the model can be trained, and newly proposed
components that we refer to as model performance verification units
(MPVUs), where the model testing phase takes place. It should be
noted that facilities providing computing resources come in various
forms: privately owned equipment such as personal devices,
distributed infrastructure such as mobile edge computing in 6G
networks, public clouds such as AWS, or any other location that is
accessible to both the data and the model and holds sufficient
compute for training. As for the MPVU, this unit is important when
conducting distributed training, as it takes the role of a trusted
proxy node that holds a globally constructed testing dataset - the
dataset is constructed by collecting sample datasets from each
participating node - and provides safe and secure access to it.
Finally, the terminal layer also hosts the AI training clients.
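The sketch below illustrates one way an MPVU could assemble its
globally constructed testing dataset from per-node samples, as
described above. The sampling fraction and data representation are
assumptions for illustration.

   # Hypothetical sketch: an MPVU assembling a global test set
   # from samples contributed by each participating node.
   import random

   def build_global_testset(node_data, fraction=0.05, seed=0):
       """Collect a small sample from every node's local data."""
       rng = random.Random(seed)
       testset = []
       for node, data in node_data.items():
           k = max(1, int(len(data) * fraction))
           testset.extend(rng.sample(data, k))
       rng.shuffle(testset)
       return testset

   nodes = {"node-a": list(range(100)),
            "node-b": list(range(100, 200))}
   print(len(build_global_testset(nodes)))  # 10 samples in total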
Akhavain & Moussa Expires 23 January 2026 [Page 25]
Internet-Draft AI-Internet July 2025
The terminal layer relies on the network layer to build an
overarching knowledge-sharing network. Specifically, the network
layer provides three main services to the terminal layer, namely: i)
moving models and data between the identified rendezvous compute
points where training can happen; ii) moving the models towards the
MPVU units where performance evaluation can be conducted to keep
track of the training progress; and iii) enabling AI training clients
to submit their models, monitor the training progress, modify
training requirements, and collect the trained models. Control and
data traffic exist for each of these services. For instance, moving
a model toward a compute facility requires authorization to use the
resources; hence, authorization control data must be exchanged over
the Terminal-NET CP links. The service also requires the physical
transmission of the model to the computing facility, which is handled
over the Terminal-NET DP link. Similar control and data splits apply
to the other services. It is worth noting that the network layer can
be built on top of any access network technology, including 3GPP
cellular networks, WiFi, wireline, peer-to-peer, satellites, and non-
terrestrial networks (NTN), or a combination of the above. These
networks can be used to build dedicated CP and DP links strictly
designed to enable the DA-ITN training system and its services.
Third, the DRRT layer holds all the information required to make
accurate decisions and sits between the intelligence layer and the
terminal layer. It consists of a DRRT-manager (DRRT-M) unit, which
is the brain of this layer and interfaces with the other layers over
CP links. The DRRT layer provides the intelligence layer with
visibility and accessibility services for specific information about
the underlying terminal layer's data, resource, and reachability
status. Specifically, the DRRT layer holds information regarding the
type, quality, amount, age, dynamics, and any other essential
attributes of the data available for training. It also provides
reachability information for the participating nodes to avoid
unnecessary communication overhead and packet drops. Lastly, the
DRRT also contains information about computing resources and MPVUs,
such as resource availability, location, trustworthiness, and the
nature of the testing datasets hosted at the different MPVU units.
The DRRT relies on the network layer to collect the information
needed to build the Global-DRRT topology (G-DRRT). The G-DRRT is
not a model-specific topology; rather, it is a large canvas that
holds the high-level view of the data, resource, and reachability
information. The DRRT-M unit in the DRRT layer communicates with the
network layer over CP links to manage the collection of the required
information. For instance, the DRRT-M may instruct the 3GPP
component of the network layer to convey connectivity information
about the data nodes, or it might instruct it to wake up an ideal
Akhavain & Moussa Expires 23 January 2026 [Page 26]
Internet-Draft AI-Internet July 2025
data provider device. It might also instruct satellites to share the
GPS locations of mobile data nodes. The data collected by the
network layer are then shipped toward the G-DRRT component of the
DRRT layer over DP links. The G-DRRT hosts intelligence that allows
it to convert the collected information into a useful global
topology, ready to provide services to the AI training clients.
Fourth, the Intelligence Layer is responsible for hosting the
decision-making logic required to fulfill the specific training
requirements submitted by clients. It contains several key
components that collaboratively determine how, where, and whether
training should proceed. Among these is the Model Training Route
Compute Engine (MTRCE), which identifies suitable rendezvous points
between models and data. Another critical component is the Training
Feasibility Assessment Module (T-FAM), which functions as an
admission controller—evaluating whether a submitted model, given its
requirements and constraints, can be effectively trained within the
available ecosystem.
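A minimal sketch of a T-FAM-style admission check follows, assuming
a topology of the shape sketched in the previous example. The
thresholds and field names are illustrative assumptions only.

   # Hypothetical sketch: a T-FAM-style admission check.  The
   # topology shape and thresholds are illustrative assumptions.
   def t_fam_admit(topo, req):
       """True if the request looks trainable in the ecosystem."""
       data_ok = any(d["type"] == req["data_type"]
                     and d["samples"] >= req["min_samples"]
                     for n, d in topo["data"].items()
                     if topo["reachability"].get(n))
       compute_ok = any(c["gpus"] >= req["min_gpus"]
                        for n, c in topo["compute"].items()
                        if topo["reachability"].get(n))
       return data_ok and compute_ok

   topo = {"data": {"n1": {"type": "text", "samples": 20_000}},
           "compute": {"n2": {"gpus": 8}},
           "reachability": {"n1": True, "n2": True}}
   req = {"data_type": "text", "min_samples": 10_000, "min_gpus": 4}
   print(t_fam_admit(topo, req))  # True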
Additional intelligent modules include the Training Algorithm
Generator (TAG) and the Hyperparameter Optimizer (HPO). These
components are responsible for selecting the appropriate training
paradigm—such as reinforcement learning (RL), federated learning
(FL), or supervised learning (SL)—as well as determining other
configuration details like the number of training epochs, batch size,
and optimization strategy. The Intelligence Layer also interfaces
with both the Network Layer and the DRRT Layer to acquire the context
needed for effective decision-making. From the Network Layer, it
receives control data over CP links—this includes model structure,
target accuracy, convergence time, monitoring instructions, and
client-specified training preferences. It also receives feedback
data that allows the TAG and HPO modules to refine their
recommendations dynamically.
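As a toy stand-in for the TAG and HPO decisions just described, the
rule-based sketch below picks a paradigm and a few configuration
values. Real modules would learn and refine these choices; the rules
and numbers here are purely illustrative assumptions.

   # Hypothetical sketch: a rule-based stand-in for TAG/HPO.
   # The rules and numbers are illustrative assumptions only.
   def choose_training_plan(req):
       if req.get("data_stays_local"):   # privacy-constrained
           paradigm = "FL"               # federated learning
       elif req.get("labels_available"):
           paradigm = "SL"               # supervised learning
       else:
           paradigm = "RL"               # reinforcement learning
       # Toy hyperparameters keyed off the dataset size.
       batch = 32 if req["samples"] < 100_000 else 256
       epochs = 10 if paradigm == "SL" else 50
       return {"paradigm": paradigm, "batch_size": batch,
               "epochs": epochs}

   print(choose_training_plan({"data_stays_local": True,
                               "samples": 50_000}))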
Meanwhile, the Intelligence Layer connects to the DRRT Layer via both
CP and DP links to access up-to-date visibility into training data,
compute resources, and node reachability. This information is
essential for components like MTRCE and T-FAM to make routing and
admission decisions. To further enhance decision efficiency, the
Intelligence Layer may also host a DRRT-Adaptability Unit (DRRT-A).
This optional module works in coordination with MTRCE, T-FAM, and the
DRRT Manager (DRRT-M) to generate model-specific DRR
topologies—lightweight, targeted representations carved out from the
global DRR topology. These customized topologies are optimized to
reduce computational overhead and accelerate decision-making for
individual training requests.
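The DRRT-A behaviour can be pictured as a filtering step, sketched
below under the same illustrative topology shape as the earlier
examples: it keeps only the reachable nodes relevant to a single
request.

   # Hypothetical sketch: carving a lightweight, model-specific
   # view out of the global topology (a DRRT-A-style operation).
   def carve_model_topology(topo, data_type, min_gpus):
       keep_data = {n: d for n, d in topo["data"].items()
                    if d["type"] == data_type
                    and topo["reachability"].get(n)}
       keep_compute = {n: c for n, c in topo["compute"].items()
                       if c["gpus"] >= min_gpus
                       and topo["reachability"].get(n)}
       nodes = set(keep_data) | set(keep_compute)
       return {"data": keep_data, "compute": keep_compute,
               "reachability": {n: True for n in nodes}}

   topo = {"data": {"n1": {"type": "text"},
                    "n3": {"type": "images"}},
           "compute": {"n2": {"gpus": 8}, "n4": {"gpus": 2}},
           "reachability": {"n1": True, "n2": True,
                            "n3": True, "n4": True}}
   print(carve_model_topology(topo, "text", min_gpus=4))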
Akhavain & Moussa Expires 23 January 2026 [Page 27]
Internet-Draft AI-Internet July 2025
Last, the OAM layer, which spans all the other layers, is mainly
intended as a management layer: it configures the training components
and the connectivity of the network layer, and enables the feedback
functions essential for progress monitoring and for model
localization and tracking. It is also intended to provide feedback
to clients about their submitted models every step of the way.
6. DA-ITN for Inference
The Inference architecture of the DA-ITN provides automated AI
inference services using a similar structure to the training
architecture with a few differences.
First, unlike training, where the moving components are models and
training data, and the rendezvous points are computing facilities, in
inference, models/agents and queries/tasks are the moving components
that require networking, and the rendezvous points are model hosting
facilities.
Second, in inference, the clients are both the task/query owners as
well as the model/agent owners. Query owners are the inference
service users who send their queries into the system and collect the
resulting inference. On the other hand, model owners are divided
into two types. The first type consists of model hosts - the model
used for inference does not have to be owned by them, but it is
hosted on their computing facilities. The second type consists of
model providers - they develop models and deploy them either at their
own facilities or at model hosts. Model owners are represented in
the terminal layer as model deployment facility providers (MDFP)
which are distributed across the global network.
Third, the network layer provides the following services to the
terminal layer using its control and data planes: i) model mobility
from model generators to model hosts; ii) query routing towards
models deployed on MDFPs; iii) model mobility from one location to
another for load balancing; iv) model mobility towards re-training
and calibration facilities, which may be hosted on MPVU units; v)
query response and inference result routing towards the query owners
or any indicated destination around the globe; and vi) feedback and
monitoring information for model and query owners.
Fourth, the DRRT layer is replaced by a query, resource, and
reachability topology (QRRT) layer. It provides the same type of
services to the other layers, but from the point of view of queries
and models. That is, it provides information about both models and
queries, such as i) for models: model locations, model capabilities,
current loading conditions, inference speed, inference accuracy, and
model reachability and accessibility (i.e., reachability
Akhavain & Moussa Expires 23 January 2026 [Page 28]
Internet-Draft AI-Internet July 2025
and accessibility of the MDFP); and ii) for queries: query patterns
and dynamics (possibly associated with a geographical location),
query types, and the reachability status of query owners for response
communication purposes. The information collected by the QRRT is
used to make appropriate decisions about model deployment and
distribution strategies, query-to-model routing, and response
routing. The QRRT has a management function that coordinates with
the Network layer to collect the required information from the
terminal layer to build the Global-QRRT (G-QRRT). It also optionally
communicates with the QRRT-adaptation (QRRT-A) function in the
inference intelligence layer to build query- or model-specific
QRRTs.
Last, the inference intelligence layer hosts several intelligent
decision-making components, including the Query Feasibility
Assessment Module (Q-FAM), the Query Inference Route Compute Engine
(QIRCE), and the Model Deployment Optimizer (MDO) module. As with
training, these components make decisions based on the QRRT. For
instance, the Q-FAM hosts intelligence that acts as an admission
control unit, evaluating whether a submitted query can be serviced
given the current network inference capabilities. The QIRCE handles
query routing towards the correct models while observing loading
conditions. Furthermore, the MDO module acts as an admission
controller for newly submitted models: it evaluates deployment
feasibility based on the submitted model's architecture, compute
requirements, and storage requirements. It matches these
requirements to the currently available resources indicated in the
QRRT and makes an admittance decision. It also handles deployment
location optimization, aiming to minimize query response time and
inference cost.
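A minimal sketch of QIRCE-style routing follows: the query goes to
the least loaded replica that advertises the required capability.
The index structure and names are illustrative assumptions, and a
missing capability stands in for the case that Q-FAM would reject.

   # Hypothetical sketch: QIRCE-style query routing to the least
   # loaded replica.  Structures and names are illustrative.
   def route_query(query, model_index):
       """model_index: capability -> list of (host, load)."""
       replicas = model_index.get(query["capability"], [])
       if not replicas:
           return None               # no serving model available
       host, _ = min(replicas, key=lambda r: r[1])
       return host

   index = {"summarize": [("mdfp-east", 0.7), ("mdfp-west", 0.2)]}
   print(route_query({"capability": "summarize"}, index))
   # -> mdfp-west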
7. DA-ITN-Facilitated Agentic Networks
While agent-to-agent interaction is commonly associated with task-
oriented collaboration—often relying on inference chaining as
discussed in the inference section—we propose that this only reflects
one side of the coin. We believe there is a transformative
alternative: collaborative agent training, where agents not only work
together to complete tasks, but also contribute to each other's
learning and evolution. This paradigm marks a significant shift from
traditional models and positions the DA-ITN as an ideal enabler of a
truly agentic future, where intelligent agents can grow, adapt, and
improve continuously through structured cooperation.
It is important to distinguish clearly between collaborative training
and task-based collaboration. In task-based collaboration, agents
exchange data or partial inferences related to the execution of a
specific, external objective—such as processing a query or generating
Akhavain & Moussa Expires 23 January 2026 [Page 29]
Internet-Draft AI-Internet July 2025
an output. Their internal models remain unchanged; they simply
contribute to a shared computational goal. In contrast,
collaborative training focuses on internal evolution: the goal is not
to solve an external task, but to enhance the capabilities of the
participating agents themselves.
In a collaborative training setup, agents may exchange model
parameters, training datasets, or knowledge representations. They
may engage in distributed training paradigms such as federated
learning, where learning happens locally and updates are shared
globally, or continual learning, where agents adapt over time based
on new experiences. They may also employ knowledge distillation or
transfer learning, where more advanced "teacher agents" guide
"student agents" through structured training programs. One can even
envision a highly dynamic and autonomous system where agents attend
“agent schools”—virtual environments where they gather to learn, be
tested, and graduate. In this imagined scenario, teacher agents
would be responsible for training student agents, evaluating their
performance, and possibly issuing certifications or verifiable
credentials that guarantee the agent’s competencies and readiness for
deployment. These credentials serve as trust foundations in the
broader agent ecosystem, ensuring that certified agents can be
reliably selected and trusted by inference clients or other agents.
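As one concrete instance of the collaborative training exchanges
described above, the sketch below shows a single federated-style
round in which agents share parameter updates rather than raw data.
The plain averaging used here is standard FedAvg; the weight values
are illustrative.

   # Hypothetical sketch: one federated-style collaborative
   # training round.  Agents exchange only model parameters.
   def fed_avg(agent_weights):
       """Average per-agent parameter vectors (plain FedAvg)."""
       n = len(agent_weights)
       dims = len(agent_weights[0])
       return [sum(w[i] for w in agent_weights) / n
               for i in range(dims)]

   # Each agent trains locally; only the weights are exchanged.
   local = [[0.1, 0.4], [0.3, 0.2], [0.2, 0.6]]
   print(fed_avg(local))  # approx. [0.2, 0.4]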
To support such a vision, a wide range of new functional and
technical requirements must be addressed. These include secure model
sharing, certification and validation infrastructure, identity
management, trust negotiation, resource discovery for training, and
scheduling of learning sessions. Fortunately, many of these
requirements align naturally with the capabilities and components of
the DA-ITN architecture—including its support for mobility,
discovery, descriptor sharing, trust enforcement, dynamic rendezvous,
and topology management.
8. Security Considerations
Security considerations are as outlined within this document under
the privacy and security requirements.
9. IANA Considerations
This document has no IANA actions.
Akhavain & Moussa Expires 23 January 2026 [Page 30]
Internet-Draft AI-Internet July 2025
10. Conclusions
As AI continues to evolve and integrate into every facet of modern
life, it becomes increasingly clear that the supporting
infrastructure must evolve with it. The training and inference
processes—central to the success of AI—are no longer simple, isolated
tasks; they are complex, distributed, and require intelligent
coordination across data, compute, and communication domains.
The DA-ITN architecture offers a forward-looking response to this
complexity by providing a cohesive, scalable, and intelligent network
ecosystem. With its dedicated control, data, and operations &
management planes, DA-ITN not only supports the technical
requirements of training and inference but also addresses critical
concerns such as mobility, privacy, trust, and agent collaboration.
Ultimately, DA-ITN lays the foundation for a new generation of AI-
native networks—capable of enabling persistent learning, dynamic
agent interaction, and decentralized intelligence at scale. As we
move toward an AI-driven future, such architectures will be essential
for building reliable, trustworthy, and efficient AI ecosystems.
Contributors
Arashmid Akhavain
Huawei Canada
Email: arashmid.akhavain@huawei.com
Hesham Moussa
Huawei Canada
Email: hesham.moussa@huawei.com
Tong Wen
Huawei
Email: tongwen@huawei.com
Authors' Addresses
Arashmid Akhavain
Huawei Canada
Email: arashmid.akhavain@huawei.com
Hesham Moussa
Huawei Canada
Akhavain & Moussa Expires 23 January 2026 [Page 31]
Internet-Draft AI-Internet July 2025
Email: hesham.moussa@huawei.com
Akhavain & Moussa Expires 23 January 2026 [Page 32]