Internet-Draft semantic-sdn-mom March 2022
Bellavista, et al. Expires 4 September 2022
Workgroup:
TODO Working Group
Internet-Draft:
draft-bellavista-semantic-sdn-mom-00
Published:
March 2022
Intended Status:
Informational
Expires:
4 September 2022
Authors:
P. Bellavista
University of Bologna
L. Foschini
University of Bologna
L. Patera
University of Bologna
M. Fogli
University of Ferrara
C. Giannelli
University of Ferrara
C. Stefanelli
University of Ferrara
D.Z. Lou
Huawei

A Framework for QoS-Enabled Semantic Routing in Industrial Networks

Abstract

Industrial networks pose unique challenges in realizing a communication substrate on the shop floor. Such challenges are due to strict Quality of Service (QoS) requirements, a wide range of protocols for data exchange, and highly heterogeneous network infrastructures. In this regard, this document proposes a framework for QoS-enabled semantic routing in industrial networks. Such a framework aims at providing loosely-coupled, asynchronous communications, fine-grained traffic management (delivery semantics and flow priorities), and in-network traffic optimization.

Discussion Venues

This note is to be removed before publishing as an RFC.

Source for this draft and an issue tracker can be found at https://github.com/fglmtt/draft-bellavista-semantic-sdn-mom.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 September 2022.

1. Introduction

This Internet Draft defines a framework for Quality of Service (QoS)-enabled semantic routing in industrial networks. The term "semantic routing" refers to a form of routing based on additional semantics other than mere IP addresses [I-D.draft-farrel-irtf-introduction-to-semantic-routing-03]. Along with the semantics carried in packet headers, such routing may also depend on policy coded in, configured at, or signaled to network devices. A network device is an element that receives/transmits packets and performs network functions on them, such as forwarding, dropping, filtering, and packet header (or payload) manipulation, among others. Network devices may operate in, above, and below the network layer.

The framework described in this draft uses overlay networking to provide a semantic routing substrate that operates at both the application and network levels.

At the application level, the framework consists of Message-Oriented Middleware (MOM) and Application Gateways (AGWs). The MOM decouples senders from receivers, organizes messages into topics of interest, and provides delivery semantics (e.g., at most once, at least once, and exactly once). The AGWs sit near industrial machines that do not natively support the protocols the framework relies on. For example, some legacy industrial machines may not even support IP-based communications. It is worth mentioning that the typical lifetime of industrial equipment is 10 to 15 years (sometimes even longer), and in many cases, the software cannot be updated due to manufacturers' policies. Accordingly, AGWs translate the plethora of (proprietary) protocols that coexist on the shop floor into the one(s) used by the framework.
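The decoupling and delivery semantics described above can be illustrated with a minimal in-memory sketch. It is not the framework's MOM: the `Broker` class, the topic name, and the convention that a subscriber returns True to acknowledge a message are all assumptions made for illustration, and exactly-once delivery is deliberately left out.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical convention: a subscriber returns True to acknowledge a message.
Subscriber = Callable[[bytes], bool]


class Broker:
    """Toy topic-based broker; senders never address receivers directly."""

    def __init__(self) -> None:
        self._topics: Dict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, topic: str, subscriber: Subscriber) -> None:
        self._topics[topic].append(subscriber)

    def publish(self, topic: str, payload: bytes,
                semantics: str = "at-most-once", max_attempts: int = 5) -> int:
        """Deliver payload to each subscriber; return how many acknowledged."""
        if semantics not in ("at-most-once", "at-least-once"):
            raise ValueError(f"unsupported semantics: {semantics}")
        # at-most-once: a single attempt; at-least-once: retry until acknowledged.
        attempts = 1 if semantics == "at-most-once" else max_attempts
        acknowledged = 0
        for subscriber in list(self._topics[topic]):
            for _ in range(attempts):
                if subscriber(payload):
                    acknowledged += 1
                    break
        return acknowledged


received: List[bytes] = []


def flaky_subscriber(payload: bytes) -> bool:
    """Simulated receiver that fails to acknowledge its first two deliveries."""
    received.append(payload)
    return len(received) >= 3


broker = Broker()
broker.subscribe("line1/temperature", flaky_subscriber)
lost = broker.publish("line1/temperature", b"21.5")
kept = broker.publish("line1/temperature", b"21.7", semantics="at-least-once")
```

In this run the first message is never acknowledged and is not retried, while the second is retried until acknowledged and therefore reaches the subscriber twice, which is the duplication that at-least-once delivery permits.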

At the network level, the framework combines two paradigms: Software-Defined Networking (SDN) [RFC7426] and In-Network Processing (INP) [ZILBERMAN2019], [PORTS2019]. Although the MOM enables critical features in message dispatching, it does not control how packets flow through network devices along routing paths. This is where SDN comes in. Specifically, the SDN controller computes optimal routes to meet the QoS requirements and configures network devices accordingly. The term INP refers to executing end-host programs within network devices. Such INP-enabled network devices operate at line rate, processing packets as they traverse them without increasing the overall network load. Given that the SDN controller holds a network-wide view, it also knows which network devices support INP and which do not. The SDN controller may redirect flows towards target INP-enabled network devices based on the processing functions they provide.

The objectives that the framework targets are the following:

  • Loosely-coupled, asynchronous communications;
  • Fine-grained traffic management (delivery semantics and flow priorities);
  • In-network traffic optimization.

The remainder of this draft is structured as follows. First, Section 2 details the target scenario. Then, Section 3 provides the requirements of the target scenario. Lastly, Section 4 presents the principles and design guidelines of the framework and Section 5 depicts its architecture.

2. Target Scenario

Traditionally, a shop floor includes industrial machines, Programmable Logic Controllers (PLCs), and Human-Machine Interfaces (HMIs). Typically, industrial machines are equipped with sensors and actuators, PLCs control manufacturing processes, and human operators interact with and receive feedback from industrial machines through HMIs. In such legacy industrial networks, message dispatching was primarily oriented toward monitoring operational and safety-related machine parameters.

Nowadays, the shop floor has become more articulated due to the advent of the Industrial Internet of Things (IIoT). On the one hand, IIoT devices enable business-critical services (e.g., predictive maintenance) cost-effectively. On the other hand, they dramatically increase overall network traffic volume, infrastructure heterogeneity, and cyber security threats.

The heterogeneity concerns not only the industrial equipment itself but also how such equipment disseminates information. The plethora of (proprietary) protocols that machines use to exchange data makes machine-to-machine communications challenging.

Additionally, the shop floor may include dynamic industrial equipment (e.g., automated guided vehicles) that communicates on the move. Such dynamic equipment may abruptly migrate its communications across different access points according to its physical location at a given time.

Therefore, modern industrial environments stress the network infrastructure more than traditional ones, where network traffic was fairly limited to mission-critical information generated by fixed network equipment.

To fulfill current industrial guidelines for cyber security (e.g., IEC 62443 [IEC62443]), the industrial topology should consist of several shop floor subnets and a control room subnet. Figure 1 depicts an industrial topology compliant with such guidelines.

 Control Room Subnet
+---------------------------------------------------------------+
|                                                               |
|  +------------+ +------------+ +------------+ +------------+  |
|  |    SDN     | |    AGW     | |    MOM     | |    INP     |  |
|  | Controller | | Controller | | Controller | | Controller |  |
|  +-----+------+ +-----+------+ +------+-----+ +------+-----+  |
|        |              |               |              |        |
|        +--------------+-------+-------+--------------+        |
|                               |                               |
|                           +---+---+                           |
|                           |  SGW  |                           |
+---------------------------+---+---+---------------------------+
                                |
----------------+---------------+----------------+---------------
                |                                |
+-----------+---+---+----------+ +-----------+---+---+----------+
|           |  SGW  |          | |           |  SGW  |          |
|           +---+---+          | |           +---+---+          |
|               |              | |               |              |
|  +------------+-----------+  | |  +----------+ |              |
|  |           AGW          |  | |  |   AGW    +-+--+------+    |
|  +-+------+------+------+-+  | |  +-+------+-+    |      |    |
|    |      |      |      |    | |    |      |      |      |    |
|  +-+------+------+------+-+  | |  +-+------+-+  +-+------+-+  |
|  |        Machines        |  | |  | Machines |  | Machines |  |
|  +------------------------+  | |  +----------+  +----------+  |
|                              | |                              |
+------------------------------+ +------------------------------+
 Shop Floor Subnet 1              Shop Floor Subnet N
Figure 1: Target network topology

Note that:

  • With "Machines", we refer to any shop floor entity (e.g., industrial machines, IIoT devices, PLCs, and so on) doing networking. This document makes no distinction among shop floor entities because AGWs can normalize their outputs if needed;
  • Each shop floor subnet may be provided with one or more AGWs, depending on whether machines support the protocols used by the framework;
  • Each subnet is provided with a Subnet Gateway (SGW), which is a network device. Additional network devices may be placed between different subnets as well as within subnets.

The network devices interconnecting the subnets form the industrial network backbone. The outcome is a multihop multipath topology providing point-to-point connections with differentiated performance.

The framework described in this document targets the scenario depicted in Figure 1. The framework components (i.e., MOM, AGW, SDN, and INP controllers) run within the control room subnet. Note that other services may also run in the control room subnet alongside them. Typical examples are the Manufacturing Execution System (MES) and the Enterprise Resource Planning (ERP) system.

3. Requirements

The transition from traditional to modern industrial environments raised the critical communication challenges described in Section 2. In this regard, it is worth remarking that industrial machines typically have long lifetimes (decades), high costs (millions of USD), and restrictive manufacturers' policies in place (e.g., to prevent firmware updates). Accordingly, the communication substrate should address such challenges by fulfilling the following requirements.

First, non-mission-critical and mission-critical traffic should be distinguished. Typically, non-mission-critical flows (e.g., monitoring of vibrations) are far more voluminous than mission-critical ones (e.g., alerting human operators about dangerous events); thus, the former may easily consume network resources at the expense of the latter. This requires per-flow traffic management, ranging from flow prioritization (mission-critical flows go first, then non-mission-critical ones) to data aggregation and filtering to reduce the traffic traversing the network. Since industrial control typically runs cyclically at the millisecond level, control traffic, especially mission-critical traffic, demands stringent QoS in terms of low latency, low jitter, and an extremely low packet loss ratio.
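The flow prioritization requirement (mission-critical flows go first, then non-mission-critical ones) can be sketched as a strict-priority queue. This is only an illustration under assumptions: the class name, the two priority constants, and the message strings are hypothetical and not part of the framework.

```python
import heapq
import itertools
from typing import List, Tuple

# Assumed convention: a lower number is served first.
MISSION_CRITICAL = 0
NON_MISSION_CRITICAL = 1


class StrictPriorityQueue:
    """Serves mission-critical messages before any non-mission-critical one."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # Monotonic counter: preserves arrival order within a priority class.
        self._arrival = itertools.count()

    def enqueue(self, priority: int, message: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._arrival), message))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = StrictPriorityQueue()
queue.enqueue(NON_MISSION_CRITICAL, "vibration-1")
queue.enqueue(NON_MISSION_CRITICAL, "vibration-2")
queue.enqueue(MISSION_CRITICAL, "operator-alert")
queue.enqueue(NON_MISSION_CRITICAL, "vibration-3")

served = [queue.dequeue() for _ in range(len(queue))]
```

The alert is served first even though it arrived third. Strict priority can starve low-priority traffic under sustained load, so a real deployment would likely need a fairness or rate-limiting mechanism on top of it.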

Second, industrial communication demands high reliability. The telecommunication equipment deployed in the Internet typically guarantees a reliability of 99.99%. However, industrial systems need to be much more reliable, from 99.9999% to 99.99999%, in order to reduce the downtime of the production line. This requires the industrial network to be equipped with additional measures to reach such levels.
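To give a sense of scale, the percentages above can be converted into yearly downtime. The sketch below assumes that the figures are read as availability over a non-leap year, which is an interpretation and not something the text states. Under that assumption, 99.99% corresponds to roughly 52.6 minutes of downtime per year, 99.9999% to about 31.5 seconds, and 99.99999% to about 3.2 seconds.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year (assumption)


def yearly_downtime_seconds(availability_percent: float) -> float:
    """Downtime per year implied by an availability expressed in percent."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return SECONDS_PER_YEAR * (1.0 - availability_percent / 100.0)


internet_grade = yearly_downtime_seconds(99.99)       # about 3153.6 s
industrial_low = yearly_downtime_seconds(99.9999)     # about 31.5 s
industrial_high = yearly_downtime_seconds(99.99999)   # about 3.2 s
```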

Third, machine-to-machine communications should be enabled straightforwardly, notwithstanding the plethora of (proprietary) dialects that coexist at the shop floor level, so that different shop floor devices can interoperate. This requires connectors to translate such dialects into a common one and metadata to express the semantics. Intermediate nodes may use semantics to process packet payloads according to the information they carry. For example, an intermediate node may average a given number of consecutive temperature values (data aggregation) or drop values of little application interest (data filtering).
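The two payload-processing examples above, aggregation and filtering, can be sketched as follows. The function names, the window size, and the threshold are illustrative assumptions; an actual intermediate node would apply such logic to packet payloads rather than to Python lists.

```python
from statistics import mean
from typing import Iterable, Iterator, List


def aggregate(values: Iterable[float], window: int) -> Iterator[float]:
    """Yield the average of each full group of `window` consecutive values.

    A trailing group with fewer than `window` values is not emitted.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    batch: List[float] = []
    for value in values:
        batch.append(value)
        if len(batch) == window:
            yield mean(batch)
            batch.clear()


def keep_above(values: Iterable[float], threshold: float) -> Iterator[float]:
    """Drop values of little interest: keep only those above the threshold."""
    return (value for value in values if value > threshold)


temperatures = [20.0, 22.0, 24.0, 30.0, 32.0, 34.0, 21.0]
averaged = list(aggregate(temperatures, window=3))
alarming = list(keep_above(temperatures, threshold=25.0))
```

Here seven readings become two averaged values (aggregation) or three values above the threshold (filtering), so fewer packets need to traverse the network in either case.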

Lastly, machines should keep communicating on the move without affecting overall performance. For example, an automated guided vehicle may move from one shop floor subnet to another. By doing so, the vehicle changes the Wi-Fi access point (i.e., SGW) used to access the network. As a result, the flows sent out by such a vehicle need to be rescheduled accordingly. This requires not only reconfiguring network devices dynamically but also doing so without disrupting the other flows already in place.

In this context, edge computing plays a crucial role: it enables the design and implementation of novel distributed control functions, parts of which are hosted on edge nodes located on the production plant premises, close to the controlled sensors and actuators, primarily to increase reliability and decrease latency. In the following, we discuss a framework for QoS-enabled semantic routing in industrial networks that is capable of coordinating several entities in a simplified manner via a single logical configuration interface (the "Northbound interface").

4. Principles and Design Guidelines

Future industrial networks will be characterized by an unprecedented degree of heterogeneity and complexity. Traditional solutions, mainly based on the direct interconnection of machines with one another and with the control room, cannot provide the required degree of flexibility. This calls for novel solutions to manage the deluge of data generated by IIoT devices and to provide QoS-driven network (re)configuration.

Given the momentum of MOM as an enabler of the Industry 4.0 vision, we believe it will become a pillar of future industrial ecosystems. Although MOM enables critical features that facilitate message dispatching independently of actual machine location, it does not control how packets flow through intermediate network devices along the routing path. In fact, once a message is sent from a broker to a consumer (or from a producer to a broker), the path the message traverses is beyond the MOM's control. However, the ability to dictate the behavior of intermediate network devices is essential to satisfy stringent QoS requirements. This is where the SDN paradigm comes in.

The SDN controller eases configuration and management of network devices, which act as the (distributed) communication substrate between the machines and the MOM. In addition, the SDN controller provides network-wide abstractions to define and enforce fine-grained network policies.

At the top level, the MOM identifies the destination nodes to which a message should be dispatched, along with the delivery semantics (e.g., at most once, at least once, or exactly once) to be applied. At the bottom level, AGWs deployed close to machines act as intermediaries between the machines (and the plethora of protocols they speak) and the MOM. At the middle level, the SDN controller exploits its network-wide view to (re)configure the network devices according to the QoS requirements.

Based on the MOM-SDN interplay, network devices can be properly configured:

  • To select the best route towards the destination and forward messages accordingly;
  • To manage competing traffic flows in a coordinated manner, e.g., to ensure prompt dispatching of mission-critical messages even if at the expense of less critical messages;
  • To enforce INP for traffic optimization, e.g., by merging consecutive packets into a single one.

For example, given two traffic flows between the MOM broker and a machine, proper routing table management allows forwarding traffic flows tagged as "mission-critical" via a large-bandwidth, low-latency path (if available). In contrast, traffic flows tagged as "not-urgent" may be delayed, where the magnitude of the imposed delay may also depend on the current level of network saturation. Finally, an INP-enabled network device may exploit the semantics of the carried data to provide content-based message management. For instance, it is possible to forward packets only if they satisfy a given rule, e.g., if they carry temperature values greater than a given threshold, or to apply functions that send pre-processed values, e.g., sending only one packet with the average of a series of received temperature values. Note that content-based message management enables decisions based on what is carried within packet payloads rather than only on packet headers (mere forwarding). However, since payload inspection and manipulation may introduce additional delays, content-based message management should be applied as widely as possible, but without burdening mission-critical traffic flows.
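The tag-based treatment of the two flows can be sketched as follows. The `Path` record, the tie-breaking rule, and in particular the linear relation between network saturation and imposed delay are assumptions chosen for illustration; the text does not prescribe any specific law.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Path:
    name: str              # hypothetical path identifier
    bandwidth_mbps: float
    latency_ms: float


def path_for_mission_critical(paths: Sequence[Path]) -> Path:
    """Pick the lowest-latency path, breaking ties by larger bandwidth."""
    return min(paths, key=lambda p: (p.latency_ms, -p.bandwidth_mbps))


def delay_for_not_urgent(saturation: float, max_delay_ms: float = 100.0) -> float:
    """Assumed linear law: no delay when idle, max_delay_ms when saturated.

    `saturation` is the fraction of network capacity in use, clamped to [0, 1].
    """
    clamped = min(max(saturation, 0.0), 1.0)
    return max_delay_ms * clamped


paths = [
    Path("A", bandwidth_mbps=100.0, latency_ms=8.0),
    Path("B", bandwidth_mbps=1000.0, latency_ms=2.0),
    Path("C", bandwidth_mbps=100.0, latency_ms=2.0),
]
chosen = path_for_mission_critical(paths)
half_loaded_delay = delay_for_not_urgent(0.5)
```

With these inputs, the mission-critical flow takes path B (lowest latency, largest bandwidth among the ties), while a not-urgent flow on a half-saturated network is held back for 50 ms.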

From a functional point of view, the INP level sits atop the data forwarding level. As in the case of SDN deployment, we do not argue that all the network devices should be INP-enabled. Instead, we promote a pragmatic approach where legacy and novel solutions cooperate effectively. Since the SDN controller holds a network-wide view, it knows which network devices offer INP and which do not. Therefore, traffic can be optimally handled by maximizing the use of INP (e.g., by routing packets that carry values that can be averaged towards network devices providing that aggregation function) while still meeting QoS requirements.
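One possible way for a controller with a network-wide view to favor INP while respecting a QoS bound is sketched below: it prefers the lowest-latency route that passes through an INP-enabled device, provided the route fits a latency budget, and otherwise falls back to the plain shortest path. The graph representation, node names, and the budget parameter are hypothetical, and Dijkstra's algorithm is used here only as a convenient stand-in for whatever route computation an SDN controller actually performs.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: link latency in ms}
Route = Tuple[float, List[str]]      # (total latency in ms, nodes traversed)


def shortest_path(graph: Graph, src: str, dst: str) -> Optional[Route]:
    """Dijkstra's algorithm; returns None if dst is unreachable from src."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, src, [src])]
    visited: Set[str] = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, latency in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + latency, neighbor, path + [neighbor]))
    return None


def route_via_inp(graph: Graph, src: str, dst: str,
                  inp_nodes: Set[str], budget_ms: float) -> Optional[Route]:
    """Prefer a route through an INP-enabled node if it fits the budget.

    The two legs are computed independently, so the result may revisit a node.
    The fallback is the plain shortest path, which may itself exceed the budget.
    """
    best: Optional[Route] = None
    for waypoint in sorted(inp_nodes):
        first = shortest_path(graph, src, waypoint)
        second = shortest_path(graph, waypoint, dst)
        if first is None or second is None:
            continue
        cost = first[0] + second[0]
        if cost <= budget_ms and (best is None or cost < best[0]):
            best = (cost, first[1] + second[1][1:])
    return best if best is not None else shortest_path(graph, src, dst)


# Hypothetical backbone: "p4" is the only INP-enabled device.
backbone: Graph = {
    "sgw1": {"s1": 1.0, "p4": 2.0},
    "s1": {"sgw1": 1.0, "sgw2": 1.0},
    "p4": {"sgw1": 2.0, "sgw2": 2.0},
    "sgw2": {"s1": 1.0, "p4": 2.0},
}
relaxed = route_via_inp(backbone, "sgw1", "sgw2", {"p4"}, budget_ms=5.0)
strict = route_via_inp(backbone, "sgw1", "sgw2", {"p4"}, budget_ms=3.0)
```

With a 5 ms budget the flow is steered through the INP-enabled device at a cost of 4 ms; with a 3 ms budget the detour no longer fits, and the flow follows the 2 ms direct route without in-network processing.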

5. Architecture

The proposed architecture, mostly working at the application layer, adopts the typical SDN approach by identifying two main areas: Control Plane and Data Plane. In the Control Plane, the following components are deployed: the MOM controller, interacting with the MOM broker; the In-Network Processing (INP) controller, managing the INP units; the SDN controller, controlling network elements; and the Gateway controller, managing the many application gateways deployed in the environment. The Data Plane consists of the implementation of the MOM, the INP units, the SDN-enabled network elements, and the Gateway components.

                               PROTO E
      +-------------+-------------+---------------------+     NORTHBOUND
      |             |             |                     |          IFACE
      v             v             v                     v
+-----+-----+ +-----+-----+ +-----+-----+         +-----+-----+
|  GATEWAY  | |    SDN    | |    INP    |         |    MOM    |  CONTROL
|CONTROLLER | |CONTROLLER | |CONTROLLER |         |CONTROLLER |    PLANE
+-----+-----+ +-----+-----+ +-----+-----+         +-----+-----+
      ^             ^             ^                     ^
      |PROTO A      |PROTO B      |PROTO C       PROTO D|     SOUTHBOUND
      |             |             |                     |         IFACES
      v             |             v                     v
+-----+-----+       |       +-----+-----+         +-----+-----+
|  GATEWAY  |       |       | INP UNIT  |         |    MOM    |
+-----------+       v       +-----------+         +-----------+     DATA
+-------------------+-----------------------------------------+    PLANE
|                                SDN                          |
+-------------------------------------------------------------+
Figure 2: Functional/layered view of the SDN-MOM distributed architecture.

Each component has different duties and responsibilities:

  • The MOM Controller is responsible for controlling and re-routing the traffic flowing into the MOM topics. It uses information coming from the northbound interface and returns control messages for the SDN Controller. It also makes decisions based on the message headers and on the information received from the SDN Controller and the Gateway Controller. Messages can be forwarded to a specific topic, duplicated among different topics, or consumed and pulled out of the flow. At the same time, the MOM Controller issues information that the SDN Controller uses to correctly configure the SDN devices and achieve the desired level of QoS on the specific output flow.
  • The Message-Oriented Middleware (MOM) is one of the core pieces of our infrastructure. It is the single logical point of communication between several sectors of the company. It contains topics that are written by Gateways and can be read by multiple other Gateways, based on the plant communication requirements. The MOM is responsible for guaranteeing differentiated QoS policies with different semantics. Typically, the at-most-once semantics can be used for best-effort machinery traffic, whereas the at-least-once semantics can be used for monitoring mission-critical assets and for control traffic. Moreover, some messages can be sent with high priority, guaranteeing differentiated traffic management and avoiding congestion.
  • The SDN Controller centralizes network intelligence in a separate component, decoupling the packet forwarding process from the routing processes. The SDN Control Plane consists of one or more controllers that are considered the brain of the network, where all intelligence is embedded. The SDN Controller configures the network resources. In our infrastructure, the SDN Controller has full knowledge of the network and the paths, guaranteeing fine-grained control of the traffic coming from the Gateways. Differentiated policies can be applied based on the content of the messages, following the rules received via the northbound interface. The traffic can be duplicated, aggregated, blocked, forwarded, and re-routed on different data paths.
  • The Gateway Controller emits control messages directed to the Gateways of the infrastructure. It works in strict coordination with the MOM Controller and the SDN Controller to avoid congestion and to keep topic abstractions consistent with the actual machine distribution. Its management duties include checking the state of all the Gateways, which must be configured consistently with the machines they serve, and synchronizing with the SDN and MOM Controllers, which can send reconfiguration messages to avoid congestion. In practice, it can manage the header that the Gateways apply to each packet, modify the priority of the messages, and define levels of QoS applied directly to the data-extraction phase.
  • The Gateways' duties include data gathering, data transformation to an internal MOM-specific representation, header addition, and the interconnection between the industrial machinery world and the MOM topic-centric world. In industrial scenarios, it is common to have machines that use different languages and protocols for data export and representation (e.g., Modbus, Profibus, OPC UA, OMG DDS, EtherCAT). For this purpose, the Gateways can be specialized with ad hoc libraries and push or pull strategies based on the specific machinery from which to gather information. Moreover, QoS can be managed directly at this level, avoiding needlessly high throughput when the plant is working under normal conditions.
  • The INP Controller is responsible for controlling the INP elements of the platform. Its duties include synchronization between INP Units, deployment of the correct function for the input flow of each specific INP Unit, and management of the QoS on the INP components.
  • The INP Units are hardware/software components responsible for reducing or pre-processing incoming data. The flows are manipulated according to the INP Controller rules, and typical map/reduce functions can be applied to the flow.
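The Gateway duties listed above (data gathering, transformation to an internal representation, and header addition) can be sketched as follows. Everything in the sketch is an assumption made for illustration: the draft does not define a message format, so the header fields, the JSON encoding, the scaling of a raw 16-bit register value, and the topic name are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Header:
    """Hypothetical header a Gateway could prepend to each message."""
    topic: str
    priority: int      # assumed convention: lower number is more urgent
    semantics: str     # e.g., "at-most-once" or "at-least-once"
    unit: str


def normalize(raw_register: int, scale: float, header: Header,
              timestamp: float) -> bytes:
    """Turn a raw 16-bit register reading into a self-describing message."""
    if not 0 <= raw_register <= 0xFFFF:
        raise ValueError("raw_register must fit in 16 bits")
    message = {
        "header": asdict(header),
        "value": raw_register * scale,
        "timestamp": timestamp,
    }
    return json.dumps(message, sort_keys=True).encode("utf-8")


header = Header(topic="line1/oven/temperature", priority=0,
                semantics="at-least-once", unit="celsius")
payload = normalize(215, scale=0.1, header=header, timestamp=0.0)
decoded = json.loads(payload)
```

Because the header travels with the value, downstream components can act on the topic, priority, and delivery semantics without knowing which machine-specific protocol produced the reading.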

Figure 2 depicts a schematic of the entire infrastructure. The paths between controller entities in the control plane (Protocol E) and between the control and data planes (Protocols A, B, C, and D) represent management/configuration exchanges that are logically separate from data flows. Data flows start from the Gateways (connected to the machinery via machine-specific protocols) and are sent through the SDN component, which spans the entire platform.

The proposed platform can be seen as an integration of several software architectures into a single system capable of interacting with them in a uniform and controlled way. In this draft, we omit our specific implementation of each protocol, and we ask the community for possible implementations capable of satisfying the needs and requirements of each step. Certain interfaces can be easily implemented using de facto standard protocols: for instance, Protocol B can be OpenFlow (Open Networking Foundation, "OpenFlow Switch Specification", Version 1.5.1, October 2015, https://opennetworking.org/wp-content/uploads/2014/10/openflow-switch-v1.5.1.pdf), and Protocol C can rely on P4 (The P4 Language Consortium, "P4_16 Language Specification", Version 1.2.1, June 2020, https://p4.org/p4-spec/docs/P4-16-v1.2.1.html). The other interfaces remain open issues and must be implemented as ad hoc solutions.

6. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

7. Security Considerations

While this Internet-Draft is not primarily focused on addressing security issues, some security considerations are of paramount importance. In particular, since the proposed solution is meant to be adopted in industrial environments, security threats could cause issues not only in the IT domain, such as service unavailability and data leaks, but also in the OT domain, with a potential impact on the safety of human operators. For this reason, we strongly advocate the adoption of best practices for the security and safety of industrial environments, and we advise applying the IEC 62443 family of standards as a prerequisite for the deployment of the proposed solution. In addition, we recognize that while the proposed solution is suited to maximizing the QoS of higher-priority industrial applications, this should not be achieved to the total detriment of lower-priority industrial applications, whose packets should be delivered anyway.

8. IANA Considerations

This document has no IANA actions.

9. References

9.1. Normative References

[I-D.draft-farrel-irtf-introduction-to-semantic-routing-03]
Farrel, A. and D. King, "An Introduction to Semantic Routing", Work in Progress, Internet-Draft, draft-farrel-irtf-introduction-to-semantic-routing-03, <https://datatracker.ietf.org/doc/html/draft-farrel-irtf-introduction-to-semantic-routing-03>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://doi.org/10.17487/RFC2119>.
[RFC7426]
Haleplidis, E., Ed., Pentikousis, K., Ed., Denazis, S., Hadi Salim, J., Meyer, D., and O. Koufopavlou, "Software-Defined Networking (SDN): Layers and Architecture Terminology", RFC 7426, DOI 10.17487/RFC7426, <https://doi.org/10.17487/RFC7426>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://doi.org/10.17487/RFC8174>.

9.2. Informative References

[BELLAVISTA2018]
Bellavista, P., Dolci, A., and C. Giannelli, "MANET-oriented SDN: Motivations, Challenges, and a Solution Prototype", DOI 10.1109/wowmom.2018.8449805, 2018 IEEE 19th International Symposium on "A World of Wireless, Mobile and Multimedia Networks" (WoWMoM), <https://doi.org/10.1109/wowmom.2018.8449805>.
[BELLO2020]
Bello, L., Lombardo, A., Milardo, S., Patti, G., and M. Reno, "Experimental Assessments and Analysis of an SDN Framework to Integrate Mobility Management in Industrial Wireless Sensor Networks", DOI 10.1109/tii.2020.2963846, IEEE Transactions on Industrial Informatics Vol. 16, pp. 5586-5595, <https://doi.org/10.1109/tii.2020.2963846>.
[IEC62443]
International Electrotechnical Commission, "IEC 62443: Industrial network and system security".
[KAUR2018]
Kaur, K., Garg, S., Aujla, G., Kumar, N., Rodrigues, J., and M. Guizani, "Edge Computing in the Industrial Internet of Things Environment: Software-Defined-Networks-Based Edge-Cloud Interplay", DOI 10.1109/mcom.2018.1700622, IEEE Communications Magazine Vol. 56, pp. 44-51, <https://doi.org/10.1109/mcom.2018.1700622>.
[LI2018]
Li, X., Li, D., Wan, J., Liu, C., and M. Imran, "Adaptive Transmission Optimization in SDN-Based Industrial Internet of Things With Edge Computing", DOI 10.1109/jiot.2018.2797187, IEEE Internet of Things Journal Vol. 5, pp. 1351-1360, <https://doi.org/10.1109/jiot.2018.2797187>.
[NATESHA2021]
V, N. and R. Guddeti, "Fog-Based Intelligent Machine Malfunction Monitoring System for Industry 4.0", DOI 10.1109/tii.2021.3056076, IEEE Transactions on Industrial Informatics Vol. 17, pp. 7923-7932, <https://doi.org/10.1109/tii.2021.3056076>.
[PORTS2019]
Ports, D. and J. Nelson, "When Should The Network Be The Computer?", DOI 10.1145/3317550.3321439, Proceedings of the Workshop on Hot Topics in Operating Systems, <https://doi.org/10.1145/3317550.3321439>.
[RFC5870]
Mayrhofer, A. and C. Spanring, "A Uniform Resource Identifier for Geographic Locations ('geo' URI)", DOI 10.17487/rfc5870, RFC Editor report, <https://doi.org/10.17487/rfc5870>.
[ZILBERMAN2019]
Zilberman, N., "In-Network Computing", <https://www.sigarch.org/in-network-computing-draft/>.

Acknowledgments

TODO acknowledge.

Authors' Addresses

Paolo Bellavista
University of Bologna
Luca Foschini
University of Bologna
Lorenzo Patera
University of Bologna
Mattia Fogli
University of Ferrara
Carlo Giannelli
University of Ferrara
Cesare Stefanelli
University of Ferrara
David Zhe Lou
Huawei