Data Collection Requirements and Technologies for Digital Twin Network
draft-zcz-nmrg-digitaltwin-data-collection-00
| Field | Value |
|---|---|
| Type | Older version of an Internet-Draft whose latest revision state is "Expired" |
| Authors | Cheng Zhou, Danyang Chen, Pedro Martinez-Julia |
| Last updated | 2022-07-10 |
| RFC stream | (None) |
| Stream state | (No stream defined) |
| Consensus boilerplate | Unknown |
| RFC Editor Note | (None) |
| IESG state | I-D Exists |
| Telechat date | (None) |
| Responsible AD | (None) |
| Send notices to | (None) |
Internet Research Task Force C. Zhou
Internet-Draft D. Chen
Intended status: Informational China Mobile
Expires: 11 January 2023 P. Martinez-Julia, Ed.
NICT
10 July 2022
Data Collection Requirements and Technologies for Digital Twin Network
draft-zcz-nmrg-digitaltwin-data-collection-00
Abstract
A Digital Twin Network is a network system that comprises a Physical
Network and a Twin Network, which are mapped to each other
interactively in real time. Constructing a Digital Twin Network
requires real-time data from the Physical Network to update the state
of the Twin Network. This document describes the data collection
requirements and presents data collection methods and tools for
building the data repository of a digital twin network.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 11 January 2023.
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
Zhou, et al. Expires 11 January 2023 [Page 1]
Internet-Draft Network Working Group July 2022
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Definitions and Acronyms
3. Data Collection Requirements for Digital Twin Network
3.1. Target Driven and On-demand Collection
3.2. Diverse Tools for Various Data
3.3. Lightweight and Efficient Collection
3.4. Open and Standardized Interfaces
3.5. Naming for Caching
3.6. Efficient Multi-Destination Delivery
4. An Efficient Data Collection Method for Digital Twin Network
4.1. Overview
4.2. Efficient Data Collection Mechanism
4.3. Data Collection Process
5. Summary
6. Security Considerations
7. IANA Considerations
8. References
8.1. Normative References
8.2. Informative References
Authors' Addresses
1. Introduction
With the deployment of the Internet of Things (IoT), cloud computing,
data centers, etc., the scale of current networks is gradually
expanding. However, the growth in network scale also increases
network complexity and induces plenty of problems. In order to
improve the autonomy of the network and reduce potential negative
effects on physical and virtual networks, we consider that an
endogenously intelligent and autonomous network architecture that
achieves self-optimization and self-decision (in general, self-
management and self-operation) is indispensable. Digital twin
technology answers the challenge of building self-management systems
because it can optimize and validate policies through real-time and
interactive mapping with physical entities
[I-D.irtf-nmrg-network-digital-twin-arch].
Data is the cornerstone required for constructing a digital twin for
a network, namely a Digital Twin Network (DTN). At large network
scale, data collection, storage, and management face great
challenges. Data collection methods and tools should therefore be
target-driven, diverse, lightweight, and efficient, while being open
and standardized. Among these requirements, a lightweight and
efficient data collection method is the most important. If full-data
collection is adopted, huge storage space and bandwidth are needed,
especially in complex scenarios that require real-time data and
traffic from multi-source, heterogeneous devices. Therefore, it is
extremely important to agree on lightweight and efficient data
collection, aggregation, and correlation methods for the telemetry
data transmission, processing, and storage required to build a DTN
system.
2. Definitions and Acronyms
PN: Physical Network
IMC: Instruction Management Center
DSC: Data Storage Center
DTN: Digital Twin Network
TSE: Telemetry Streaming Element
RDF: Resource Description Framework
CEP: Complex Event Processing
3. Data Collection Requirements for Digital Twin Network
3.1. Target Driven and On-demand Collection
The monitoring data of a network is the basis to build a DTN system.
Such data is collected from physical and virtual networks. It
includes, but is not limited to, the following types:
* Provisioning and operational status of physical or virtual devices,
as well as the network topology with all network elements.
* Running status of physical, logical, or virtual ports and links.
* Log and event records of all the network elements.
* Statistics (packet loss, traffic throughput, latency, etc.) of
flows and ports.
* Various data regarding users and services.
* Life-cycle operation data of all network elements.
* All of the above data as time series.
The collection of network data for maintaining a DTN should be
target-driven and on-demand. It is not always necessary to collect
the complete set of network data listed above because of the high
resource cost (CPU, memory, bandwidth, etc.). The type, frequency,
and method of data collection needed to serve a DTN application
depend on the specific network topology and application
requirements.
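As an illustration of target-driven, on-demand collection, the
following Python sketch derives a collection plan from the needs
declared by DTN applications instead of collecting every data type
listed above. The catalogue entries, the default periods, and the
names Need and build_plan are hypothetical and are not defined by
this document.

```python
from dataclasses import dataclass

# Hypothetical catalogue: data type -> default collection period (seconds).
CATALOGUE = {
    "device_status": 60,
    "topology": 300,
    "port_statistics": 10,
    "flow_statistics": 10,
    "logs_events": 30,
}


@dataclass(frozen=True)
class Need:
    """What one DTN application asks for (illustrative structure)."""
    data_type: str
    max_period_s: int  # slowest collection period the application tolerates


def build_plan(needs):
    """Return {data_type: period_s} covering only the requested data types."""
    plan = {}
    for need in needs:
        if need.data_type not in CATALOGUE:
            raise KeyError(f"unknown data type: {need.data_type}")
        period = min(CATALOGUE[need.data_type], need.max_period_s)
        # When several applications need the same type, the fastest period wins.
        plan[need.data_type] = min(period, plan.get(need.data_type, period))
    return plan


needs = [
    Need("port_statistics", 5),
    Need("topology", 600),
    Need("port_statistics", 30),
]
plan = build_plan(needs)
print(plan)
```

In this sketch, data types that no application requested (for
example, logs_events) never enter the plan, so no device resources
are spent on them.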
3.2. Diverse Tools for Various Data
The different types of network data used to maintain a DTN have
different characteristics. Some data (e.g., port statistics, key
link information) requires a higher collection frequency, and some
data (e.g., flow status, link faults) must be delivered closer to
real time. Some data (e.g., device status, port statistics) can be
collected directly and simply with common tools, while other data
(e.g., per-flow latency, traffic matrix) can only be acquired through
complex network measurement. Therefore, multiple tools or methods
are needed to collect the massive data required to build the DTN
entity.
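One way to picture the need for multiple tools is a small dispatch
rule that maps the characteristics of a data type to a candidate
tool. The Python sketch below is only an illustration: the
DataProfile structure and the selection rules in candidate_tool are
assumptions made for this example, not recommendations of this
document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProfile:
    """Characteristics of one data type (illustrative structure)."""
    name: str
    high_frequency: bool     # needs a short collection period
    real_time: bool          # needs low delivery delay
    needs_measurement: bool  # only obtainable through network measurement


def candidate_tool(profile):
    """Pick a candidate tool; the rules are assumptions of this sketch."""
    if profile.needs_measurement:
        return "INT"
    if profile.high_frequency or profile.real_time:
        return "Telemetry"
    return "SNMP"


profiles = [
    DataProfile("device status", False, False, False),
    DataProfile("port statistics", True, False, False),
    DataProfile("link fault", False, True, False),
    DataProfile("per-flow latency", True, True, True),
]
selection = {p.name: candidate_tool(p) for p in profiles}
print(selection)
```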
Currently, widely used tools such as SNMP, NETCONF, Telemetry, INT
(In-band Network Telemetry), and DPI (Deep Packet Inspection) are
candidate tools to collect data for a digital twin network. Going
forward, it is necessary to study new data collection technology in
the following aspects, in combination with the data requirements of
DTN network applications:
* High-performance data collection technology based on programmable
circuits.
* Measurement methods for complex network data such as network
performance and network traffic.
* Collaborative data collection technology for multiple data
sources.
* Distributed and collaborative data collection technology for
complex networks, including the time synchronization problem of
data acquisition.
3.3. Lightweight and Efficient Collection
Data collection tools and methods should be as lightweight as
possible, so as to reduce the occupation of network equipment
resources and ensure that data collection does not affect the normal
operation of the network. The major requirements are listed below.
* Data collection tools and methods need to improve execution
efficiency and reduce the cost of computing, storage, and
communication bandwidth.
* The collection of redundant data should be avoided or minimized.
* For the data set that needs to be collected, data compression
should be fully used to reduce the resource cost in the
collection phase.
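The last two requirements can be sketched in Python as follows.
The example first drops samples whose value has not changed since
the last reported sample for the same key, and then compresses what
remains. Both choices (change-based suppression and zlib/DEFLATE
over JSON) are assumptions made for illustration; this document does
not prescribe a specific technique, and the function names are
hypothetical.

```python
import json
import zlib


def suppress_unchanged(samples):
    """Keep a sample only if its value differs from the last kept one per key."""
    last = {}
    kept = []
    for key, timestamp, value in samples:
        if last.get(key) != value:
            kept.append((key, timestamp, value))
            last[key] = value
    return kept


def pack(samples):
    """Serialize the samples as JSON and compress them with zlib."""
    return zlib.compress(json.dumps(samples).encode("utf-8"))


def unpack(blob):
    """Inverse of pack(): decompress and restore the list of tuples."""
    decoded = json.loads(zlib.decompress(blob).decode("utf-8"))
    return [tuple(item) for item in decoded]


# 100 polls of one port status, with a single short outage at t=50.
samples = [("eth0/oper_status", t, "up") for t in range(100)]
samples[50] = ("eth0/oper_status", 50, "down")

kept = suppress_unchanged(samples)
blob = pack(kept)
print(len(samples), len(kept), len(blob))
```

Only the state changes survive suppression, and the receiver can
recover them exactly with unpack().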
3.4. Open and Standardized Interfaces
The data collection interfaces used to build the DTN should be open
and standardized to help avoid hardware or software vendor lock-in
and to achieve interoperability. The major requirements of data
collection interfaces are:
* Support configuration management, including the data collection
protocol, frequency or period, etc.
* Support several speed options (e.g., minute level, 10-second level,
second level (near real time), and real-time level) to accommodate
different data requirements from applications.
* Be extensible so that more features can be added with limited
parameter changes and with backward compatibility.
* Be able to provide a secure and reliable information exchange
mechanism.
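A minimal Python sketch of a configuration object for such an
interface is shown below. It models the speed options listed above
and ignores unknown parameters so that newer senders remain
compatible with older receivers. The field names, the Speed values,
and parse_config are hypothetical; no existing data model is implied.

```python
from dataclasses import dataclass
from enum import Enum


class Speed(Enum):
    """Collection speed options; values are periods in seconds."""
    MINUTE = 60.0
    TEN_SECONDS = 10.0
    SECOND = 1.0
    REAL_TIME = 0.0  # assumption: pushed on change, no fixed period


@dataclass(frozen=True)
class CollectionConfig:
    protocol: str
    speed: Speed
    version: int = 1
    secure_transport: bool = True


def parse_config(raw):
    """Build a config from a dict, ignoring parameters this version does not know."""
    known = {"protocol", "speed", "version", "secure_transport"}
    fields = {key: value for key, value in raw.items() if key in known}
    fields["speed"] = Speed[fields["speed"]]  # look up the option by name
    return CollectionConfig(**fields)


cfg = parse_config(
    {"protocol": "NETCONF", "speed": "TEN_SECONDS", "future_option": 1}
)
print(cfg)
```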
3.5. Naming for Caching
Both raw network data and knowledge items obtained from monitoring
must be uniquely addressable. This means giving a unique identifier,
or "name", to each data or knowledge item. Caching mechanisms will
use this name to store the item and to provide it to clients, which
will request it by the same name.
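The following Python sketch shows one possible reading of this
requirement: items are stored and retrieved by name, and a name can
never be rebound to a different item. The hierarchical name layout
produced by make_name and the NamedCache class are assumptions of
this example; this document does not define a naming scheme.

```python
def make_name(domain, element, data_type, timestamp):
    """Build a hierarchical name; the layout is an assumption of this sketch."""
    return f"/{domain}/{element}/{data_type}/{timestamp}"


class NamedCache:
    """Stores data or knowledge items under unique names."""

    def __init__(self):
        self._items = {}

    def put(self, name, item):
        # A name must always refer to the same item, so rebinding is rejected.
        if name in self._items and self._items[name] != item:
            raise ValueError(f"name already bound to a different item: {name}")
        self._items[name] = item

    def get(self, name):
        """Return the item, or None on a cache miss."""
        return self._items.get(name)


cache = NamedCache()
name = make_name("pn1", "router7", "port_statistics", 1657411200)
cache.put(name, {"rx_packets": 1200, "tx_packets": 900})
print(name, cache.get(name))
```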
3.6. Efficient Multi-Destination Delivery
The maintenance of DTN systems will not be the sole purpose of
communicating monitoring information and knowledge. Other
applications may also request raw telemetry data or knowledge items,
and they can use the name to identify the item they need. The
telemetry system, following the recommendations of RFC 9232
[RFC9232], will deliver the requested data or knowledge items to the
requesters as efficiently as possible. On the one hand, items will
be provided by the cache closest to the destination of the data. On
the other hand, items will be replicated in the best nodes, following
an efficient multicast spanning tree. Different underlying protocols
can be used to implement this mechanism.
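Both ideas can be sketched in Python on a small hop-count graph. The
example picks the cache with the fewest hops to a requester and
builds a delivery tree as the union of shortest paths from the
source to all destinations. This is a simple shortest-path tree
chosen for illustration, not an optimal multicast tree, and the
topology and function names are hypothetical.

```python
from collections import deque


def bfs_parents(graph, root):
    """Breadth-first search; returns each reachable node's parent."""
    parents = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return parents


def path_to(parents, node):
    """Path from the BFS root to node, as a list of nodes."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]


def closest_cache(graph, requester, caches):
    """Return the reachable cache with the fewest hops from the requester."""
    parents = bfs_parents(graph, requester)
    reachable = [cache for cache in caches if cache in parents]
    return min(reachable, key=lambda cache: len(path_to(parents, cache)))


def delivery_tree(graph, source, destinations):
    """Union of shortest paths from source to every destination."""
    parents = bfs_parents(graph, source)
    edges = set()
    for destination in destinations:
        path = path_to(parents, destination)
        edges.update(zip(path, path[1:]))
    return edges


graph = {
    "tse": ["r1"],
    "r1": ["tse", "r2", "r3"],
    "r2": ["r1", "dtn", "app1"],
    "r3": ["r1", "app2"],
    "dtn": ["r2"],
    "app1": ["r2"],
    "app2": ["r3"],
}
tree = delivery_tree(graph, "tse", ["dtn", "app1", "app2"])
nearest = closest_cache(graph, "app1", ["r3", "r2"])
print(len(tree), nearest)
```

In this topology the tree uses each link once, whereas three
separate unicast deliveries from the source would cross nine links in
total.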
4. An Efficient Data Collection Method for Digital Twin Network
4.1. Overview
The system that manages the DTN maps the PN to the DTN in real time.
However, existing methods collect the full data from the PN for
modeling and do not consider problems such as time lag, insufficient
storage resources, low computational efficiency, and the waste of
bandwidth resources caused by data transmission. In order to solve
these problems, this section introduces an efficient data collection
method for maintaining the DTN. This method is based on sending
instructions to the elements of the PN so that they pre-process the
data (data cleaning or knowledge representation) before sending it
back to be applied to the DTN.
4.2. Efficient Data Collection Mechanism
The management system structure consists of the PN and the DTN. The
PN includes multiple Data Storage Centers (DSCs) and a Telemetry
Streaming Element (TSE), and the DTN includes the Instruction
Management Center (IMC) and a Data Storage Center (DSC). The TSE has
multiple functions, including data collection, data aggregation, data
correlation, and knowledge representation and query. In addition, a
Complex Event Processing (CEP) engine is integrated into the TSE to
perform queries on the streamed data. The IMC has two functions. On
the one hand, it manages the registration of the DSCs in the PN side.
The registration information can include key items such as the IP
address of the DSC in the PN side, the chosen data type, the index
names in the data, the data source name, and the data size. On the
other hand, it adaptively configures data collection instructions
according to the collection requirements of the DSC in the DTN side
and looks up the IP addresses to which the instructions are sent.
The information carried by an instruction includes rule-based
mathematical expressions, executable models in .exe format, dynamic
collection frequency, parameter lists, program text files in .m
format, text files with parameter configuration, and other types of
files. Instructions are flexible and programmable, and can be
created, modified, combined, and deleted at any time according to
requirements. When the DSC of the DTN side requests data from the
IMC, the IMC searches for the registered IP address of the DSC in the
PN side in the registration database, which is indexed by critical
information such as data type and data name. Functional instructions
for data processing or knowledge representation can then be
configured on demand. The DSC of the DTN side stores the effective
information returned by the TSE after data processing and knowledge
representation.
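The two IMC functions can be sketched in Python as a registry plus an
instruction builder. This is a minimal illustration under stated
assumptions: the Registration and Instruction fields follow the items
named above, but the class layout, the lookup key (data type and data
source), and the operation names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Registration:
    """Registration information of a DSC in the PN side."""
    ip_address: str
    data_type: str
    data_source: str
    index_names: tuple
    data_size: int


@dataclass
class Instruction:
    """A data collection instruction (illustrative content only)."""
    target_ip: str
    data_type: str
    frequency_s: float
    operations: list = field(default_factory=list)


class InstructionManagementCenter:
    def __init__(self):
        # Registration database keyed by (data_type, data_source).
        self._registry = {}

    def register(self, registration):
        key = (registration.data_type, registration.data_source)
        self._registry[key] = registration

    def configure(self, data_type, data_source, frequency_s, operations):
        """Look up the target address and build the instruction."""
        registration = self._registry[(data_type, data_source)]
        return Instruction(
            registration.ip_address, data_type, frequency_s, list(operations)
        )


imc = InstructionManagementCenter()
imc.register(
    Registration("192.0.2.10", "latency", "probe-a", ("flow_id", "rtt_ms"), 4096)
)
instruction = imc.configure("latency", "probe-a", 1.0, ["clean", "normalize"])
print(instruction)
```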
The DSC in the PN side has two functions. On the one hand, it stores
data of various types, such as performance indicators, operational
status, logs, traffic scheduling, business requirements, etc. On the
other hand, it automatically parses the instructions sent by the TSE.
The operating environment of the instruction is then configured
according to the needs of the instruction, and data processing or
knowledge representation is performed based on the instruction. Data
processing mainly includes data cleaning, filling in missing data,
normalization, conflict verification, etc. Knowledge representation
refers to representing the original data as a data structure that can
be used for efficient computation. Such representation results are
closer to machine language, which is conducive to the rapid and
accurate construction of the model.
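The data processing steps named above can be sketched in Python as a
small pipeline. The concrete choices here are assumptions made for
illustration: invalid readings are treated as missing, gaps are
filled with the previous known value, and normalization is min-max
scaling to the range [0, 1]. Conflict verification is not shown.

```python
def clean(values):
    """Replace readings that are not non-negative numbers with None."""
    cleaned = []
    for value in values:
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value >= 0:
            cleaned.append(float(value))
        else:
            cleaned.append(None)
    return cleaned


def fill_missing(values):
    """Forward-fill gaps; leading gaps take the first known value."""
    known = [value for value in values if value is not None]
    if not known:
        return []
    filled = []
    last = known[0]
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def normalize(values):
    """Min-max scale the values to [0, 1]."""
    if not values:
        return []
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


raw = [10, None, "n/a", 30, -5, 20]
processed = normalize(fill_missing(clean(raw)))
print(processed)
```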
+------------------------------+ +-----------------------+
| Physical Network | | Digital Twin Network |
| +-----+ +-----+ +------+ | | +------+ +-------+ |
| | | | | | | | | | | | | |
| | DSC |... | DSC | | TSE | | | | IMC | | DSC | |
| | | | | | | | | | | | | |
| +-+---+ +--+--+ +---+--+ | | +---+--+ +----+--+ |
| | | | | | | | |
+------------------------------+ +-----------------------+
| | | | |
| 1.1. Register | | |
+-----------+---------> | |
| | | | |
| | 1.2. Register | |
| +---------> | |
| | | 1.3. Register | |
| | +---------------> |
| | | 2. Data req. |
| | | <----------+
| | | 3. Query and instruction |
| | | configuration |
| | | + |
| | 4. Send instructions |
| | <---------------+ |
| | | | |
| | 5. Parse and execute | |
| | instruction | |
| 6. Data subscript. | | |
<---------------------+ | |
| 7. Knowledge | | |
| representation | | |
| 8. Data pushing | | |
+---------------------> | |
| | 9. Data aggregation and | |
| | correlation | |
| | | 10. Send processed data |
| | +-------------------------->
| | | | |
Figure 1: Data Collection Process
4.3. Data Collection Process
The specific process is as follows:
* The DSC in the PN side registers into the TSE. The TSE registers
into the IMC. Both provide their IP addresses, the data type, the
data source, the data size, etc.
* The DSC in the DTN side sends the data collection request to the
IMC.
* According to the data collection request, the IMC intelligently
queries the registration addressing information and configures the
data processing instruction.
* According to the query result, the IMC in the DTN side sends the
corresponding instruction to the TSE.
* After receiving the instructions, the TSE parses them and executes
them. The query function can be performed by the CEP engine,
which receives all telemetry data and processes it with all
queries provided.
* The TSE sends a data subscription to the DSC in the PN side.
* The DSC in the PN side represents the data semantically in RDF
form or sends the data in raw form to the TSE for it to make the
semantic representation.
* The DSC in the PN side pushes the data or knowledge item to the
TSE.
* The TSE aggregates and correlates the collected data or knowledge
items. Then, according to actual needs, it generates aggregated
data or knowledge items.
* The TSE sends the resulting data or knowledge items to the DSC in
the DTN side.
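The process above can be walked through with the following Python
sketch, which simulates the elements as in-memory objects and marks
the step numbers of Figure 1 in comments. It is an illustration under
stated assumptions, not an implementation of this method: the class
and method names are hypothetical, knowledge items are represented as
plain (subject, predicate, object) tuples rather than real RDF, and
the instruction carries only a hypothetical threshold filter.

```python
class PnDsc:
    """DSC in the PN side: stores readings and pushes them on subscription."""

    def __init__(self, ip, data_type, readings):
        self.ip, self.data_type, self.readings = ip, data_type, readings

    def push(self, subscription):
        # Steps 7-8: represent each reading as a triple and push it.
        subject = f"urn:example:{self.ip}"
        return [(subject, subscription["data_type"], v) for v in self.readings]


class Tse:
    def __init__(self):
        self.dscs = {}

    def register(self, dsc):
        self.dscs[dsc.ip] = dsc  # steps 1.1 and 1.2

    def registrations(self):
        return {dsc.data_type: ip for ip, dsc in self.dscs.items()}  # step 1.3

    def execute(self, instruction, deliver):
        # Step 5: parse the instruction. Step 6: subscribe to the DSC.
        dsc = self.dscs[instruction["target_ip"]]
        triples = dsc.push({"data_type": instruction["data_type"]})
        # Step 9: CEP-like filter, then aggregation of the matching values.
        values = [obj for _, _, obj in triples if obj >= instruction["threshold"]]
        item = {
            "data_type": instruction["data_type"],
            "count": len(values),
            "max": max(values, default=None),
        }
        deliver(item)  # step 10: send the processed data


class Imc:
    def __init__(self, tse):
        self.tse = tse
        self.registry = tse.registrations()

    def handle_request(self, data_type, threshold, requester):
        # Step 3: query the registry and configure the instruction.
        instruction = {
            "target_ip": self.registry[data_type],
            "data_type": data_type,
            "threshold": threshold,
        }
        self.tse.execute(instruction, requester.receive)  # step 4


class DtnDsc:
    """DSC in the DTN side: stores what the TSE returns."""

    def __init__(self):
        self.store = []

    def receive(self, item):
        self.store.append(item)


tse = Tse()
tse.register(PnDsc("192.0.2.1", "latency_ms", [3, 12, 7, 25]))
imc = Imc(tse)
dtn_dsc = DtnDsc()
imc.handle_request("latency_ms", 10, dtn_dsc)  # step 2: data request
print(dtn_dsc.store)
```

Only one aggregated item reaches the DTN side in this sketch, instead
of the four raw readings held in the PN.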
5. Summary
This draft describes the requirements for data collection and
presents the data collection methods and tools required to build the
data repository for maintaining DTN systems. These methods and tools
should be target-driven, diverse, lightweight, and efficient, while
being open and standardized. Among these requirements, being
lightweight and efficient is the most important. Thus, this draft
provides a lightweight and efficient method for data collection that
is particularly optimized for maintaining DTN systems. Going
forward, more methods (transformation and aggregation functions) and
tools (solutions) need to be studied to extend the contents of this
draft.
6. Security Considerations
TBD.
7. IANA Considerations
This document has no requests to IANA.
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC9232] Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L., and
A. Wang, "Network Telemetry Framework", RFC 9232,
DOI 10.17487/RFC9232, May 2022,
<https://www.rfc-editor.org/info/rfc9232>.
8.2. Informative References
[I-D.irtf-nmrg-network-digital-twin-arch]
Zhou, C., Yang, H., Duan, X., Lopez, D., Pastor, A., Wu,
Q., Boucadair, M., and C. Jacquenet, "Digital Twin
Network: Concepts and Reference Architecture", Work in
Progress, Internet-Draft, draft-irtf-nmrg-network-digital-
twin-arch-00, 21 March 2022,
<https://www.ietf.org/archive/id/draft-irtf-nmrg-network-
digital-twin-arch-00.txt>.
Authors' Addresses
Cheng Zhou
China Mobile
Beijing
100053
China
Email: zhouchengyjy@chinamobile.com
Danyang Chen
China Mobile
Beijing
100053
China
Email: chendanyang@chinamobile.com
Pedro Martinez-Julia (editor)
NICT
4-2-1, Nukui-Kitamachi, Koganei, Tokyo
184-8795
Japan
Email: pedro@nict.go.jp