DetNet Shaofu. Peng
Internet-Draft ZTE
Intended status: Standards Track Peng. Liu
Expires: 6 January 2024 China Mobile
Kashinath. Basu
Oxford Brookes University
Aihua. Liu
ZTE
Dong. Yang
Beijing Jiaotong University
Guoyu. Peng
Beijing University of Posts and Telecommunications
5 July 2023
Timeslot Queueing and Forwarding Mechanism
draft-peng-detnet-packet-timeslot-mechanism-03
Abstract
IP/MPLS networks use packet switching (with the feature store-and-
forward) and are based on statistical multiplexing. Statistical
multiplexing is essentially a variant of time division multiplexing,
which refers to the asynchronous and dynamic allocation of link
timeslot resources. In this case, the service flow does not occupy a
fixed timeslot, and the length of the timeslot is not fixed, but
depends on the size of the packet. Statistical multiplexing has
certain challenges and complexity in meeting deterministic QoS, and
its delay performance depends on the queueing mechanism in use.
This document describes a generic time division
multiplexing scheme in IP/MPLS networks, which we call the timeslot
queueing and forwarding (TQF) mechanism. It aims to bring timeslot
resources to layer-3, to make it easier for the control plane to
calculate the delay performance based on the deterministic resources,
and also make it easier for the data plane to create more flexible
timeslot mapping.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Peng, et al. Expires 6 January 2024 [Page 1]
Internet-Draft Timeslot Queueing and Forwarding July 2023
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 6 January 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6
3. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.1. Timeslot Resource Reservation in Control-plane . . . . . 9
3.1.1. Timeslot Mapping Relationship . . . . . . . . . . . . 11
3.1.1.1. Deduced by Single Timeslot Mapping Detection . . 11
3.1.1.2. Deduced by Phase Difference of Orchestration
Period . . . . . . . . . . . . . . . . . . . . . . 13
3.1.2. Timeslot Resource Definition . . . . . . . . . . . . 14
3.1.3. Arrival Position in the Orchestration Period . . . . 15
3.1.4. Process of Each Reservation Sub-task . . . . . . . . 17
3.1.4.1. Resource Reservation on the Ingress Node . . . . 19
3.1.4.2. Resource Reservation on the Transit Node . . . . 20
3.1.4.3. Resource Reservation on the Egress Node . . . . . 21
3.1.4.4. End-to-end Delay and Jitter . . . . . . . . . . . 22
3.2. Timeslot Resource Access in Data-plane . . . . . . . . . 22
3.2.1. Conversion of Timeslot ID . . . . . . . . . . . . . . 23
4. Global Timeslot ID . . . . . . . . . . . . . . . . . . . . . 25
5. Summary of Timeslot Style . . . . . . . . . . . . . . . . . . 27
6. In-time Scheduling . . . . . . . . . . . . . . . . . . . . . 28
7. Queue Design . . . . . . . . . . . . . . . . . . . . . . . . 28
7.1. Queue Design of On-time Scheduler . . . . . . . . . . . . 28
7.1.1. Full Queues . . . . . . . . . . . . . . . . . . . . . 29
7.1.2. Non-full Queues . . . . . . . . . . . . . . . . . . . 29
7.2. Queue Design of In-time Scheduler . . . . . . . . . . . . 29
8. Multiple Orchestration Periods . . . . . . . . . . . . . . . 29
9. Admission Control on the Headend . . . . . . . . . . . . . . 31
10. Frequency Synchronization . . . . . . . . . . . . . . . . . . 33
11. Evaluations . . . . . . . . . . . . . . . . . . . . . . . . . 33
12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35
13. Security Considerations . . . . . . . . . . . . . . . . . . . 35
14. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 35
15. References . . . . . . . . . . . . . . . . . . . . . . . . . 35
15.1. Normative References . . . . . . . . . . . . . . . . . . 35
15.2. Informative References . . . . . . . . . . . . . . . . . 36
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 36
1. Introduction
IP/MPLS networks use packet switching (with the feature store-and-
forward) and are based on statistical multiplexing. The discussion
of supporting multiplexing in the network was first seen in the time
division multiplexing (TDM), frequency division multiplexing (FDM)
and other technologies of telephone communication network (using
circuit switching). Statistical multiplexing is essentially a
variant of time division multiplexing, which refers to the
asynchronous and dynamic allocation of link resources. In this case,
the service flow does not occupy a fixed timeslot, and the length of
the timeslot is not fixed, but depends on the size of the packet. In
contrast, synchronous time division multiplexing means that a
sampling frame (or termed as time frame) includes a fixed number of
fixed length timeslots, and the timeslot at a specific position is
allocated to a specific service. The utilization rate of link
resources in statistical multiplexing is higher than that in
synchronous time division multiplexing. However, if we want to
provide deterministic end-to-end delay in packet switched networks
based on statistical multiplexing, the difficulty is greater than
that in synchronous time division multiplexing. The main challenge
is to obtain a deterministic upper bound on the queueing delay, which
is closely related to the queueing mechanism used in the network.
In addition to IP/MPLS networks, other packet switched network
technologies, such as ATM, also discuss how to provide a
corresponding transmission quality guarantee for different service
types. Before service communication, ATM needs to establish a
connection to reserve virtual path/channel resources, and uses
fixed-length short cells and timeslots. The advantage of short cells
is small interference delay, but the disadvantage is low encoding
efficiency. The mapping relationship between ATM cells and timeslots
is not fixed, so it still depends on a specific cell scheduling
mechanism (such as [ATM-LATENCY]) to ensure delay performance.
Although the calculation of delay performance based on short,
fixed-length cells is more concise than that of IP/MPLS networks
based on variable-length packets, both essentially depend on the
queueing mechanism.
[TAS] introduces a synchronous time-division multiplexing method
based on gate control list (GCL) rotation in Ethernet LANs. Its basic
idea is to calculate when the packets of a service flow arrive at a
given node; the node then turns on the green light (i.e., sets the
transmission state to OPEN) for the queue that holds the service flow
during that time duration, which is defined as the TimeInterval
between two adjacent items in the gating cycle. The TimeInterval is
exactly the timeslot resource that can be reserved for a service
flow. A set of queues is controlled by the GCL, which is served round
robin in each gating cycle. The gating cycle (e.g., 250 us) contains
many items, and each item sets the OPEN/CLOSED states of all traffic
class queues. By strictly controlling the release time of service
flows at the network entry node, multiple flows always arrive
sequentially during each gating cycle at the intermediate node and
are sent during their respective fixed timeslots to avoid conflicts,
with extremely low queueing delay and cut-through behavior. However,
the GCL state (i.e., the set of items and the different TimeInterval
values between adjacent items) depends on all the ordered flows that
pass through the node. Calculating and installing GCL states
separately on each node raises scalability issues.
[CQF] introduces a synchronous time-division multiplexing method
based on fixed-length cycles in Ethernet LANs. [Multi-CQF] is a
further enhancement of the classic CQF and may be applicable to
large-scale networks. CQF with 2-buffer mode or Multi-CQF with
3-buffer mode uses only a small number of cycles to establish the
cycle mapping between a port pair of two adjacent nodes, which is
independent of the individual service flows. The cycle mapping may be
maintained on each node and swapped based on a single cycle id carried
in the packet during forwarding ([I-D.eckert-detnet-tcqf]), or all
cycle mappings may be carried in the packet as a cycle stack and read
per hop during forwarding
([I-D.chen-detnet-sr-based-bounded-latency]). According to
[Multi-CQF], the number of cycles required (i.e., the x-buffer mode)
depends on the proportion of the variation in intra-node forwarding
delay relative to the cycle size. If the proportion is small, 3
buffers are enough; otherwise, more than 3 output buffers are needed.
Compared to TAS, CQF/Multi-CQF no longer maintains a GCL on each node,
but instead replaces the large number of variable-length timeslots
related to service flows in the GCL with a small number of
fixed-length cycles unrelated to service flows. Thus, CQF/Multi-CQF
simplifies the data plane but leaves the complexity to the control
plane, which must calculate and control the release time of service
flows at the network entry to guarantee that no flows conflict in any
cycle on any intermediate node.
To meet large-scale requirements, this document provides a scheduling
mechanism that enhances TAS. First, it brings timeslot resources to
layer 3 and constructs timeslot resources on each link within the
gating cycle, which are advertised in the network and can be reserved
for service flows. Second, it defines a timeslot-based queueing
mechanism on the data plane with on-time or in-time behavior. We call
this mechanism Timeslot Queueing and Forwarding (TQF). The selected
length of the gating cycle depends on the length of the supported
service burst interval.
Like TAS and CQF/Multi-CQF, TQF is a TDM-based scheduling
mechanism.
* Compared to classic TAS, TQF on-time scheduling maintains round
robin queues corresponding to the count of timeslots in the gating
cycle, while TAS only maintains queues corresponding to the number
of traffic classes. That means TQF needs more queues than TAS.
However, TAS needs other complex methods to control the arrival
order of all flows sharing the same traffic class queue in order to
isolate them (so that each flow faces almost zero queueing delay),
while TQF's timeslot queues naturally isolate flows by the timeslot
id of the gating cycle. In addition, TQF in-time scheduling may use
a single PIFO (push-in first-out) queue to approximate the
cut-through behavior of TAS.
* Compared to CQF/Multi-CQF, TQF on-time scheduling maintains round
robin queues corresponding to the count of timeslots in the gating
cycle, while CQF/Multi-CQF maintains extra tolerating queues
depending on the proportion of the variation in intra-node
forwarding delay relative to the cycle size. TQF also needs more
queues than CQF/Multi-CQF. Because CQF/Multi-CQF defines no gating
cycle with timeslot resources, it needs other complex methods to
control the arrival order of flows sharing the same cycle queue in
order to isolate them, while TQF's
timeslot queues naturally isolate flows by the timeslot id of the
gating cycle. This is also the semantic difference between cycle id
and timeslot id: the former indicates the number of an aggregated
queue, such as the sending, receiving, or tolerating queue, whereas
the latter indicates an individual timeslot resource within the
gating cycle. That is, after defining timeslot resources in
IP/MPLS, TQF does not restrict how the forwarding plane implements
the data structure corresponding to timeslot resources, which may be
round robin queues or a single PIFO queue.
2. Terminology
The following terminology is introduced in this document:
Timeslot: The smallest unit of TQF scheduling. Its length should be
set to a reasonable value, such as 10 us, that allows at least one
complete packet to be sent. Different nodes can be configured with
different timeslot lengths.
Timeslot Scheduling: The packet is stored in the queue corresponding
to a specific timeslot id and then sent in that timeslot. The
timeslot id is always a number within the orchestration period.
Service Burst Interval: The traffic specification of deterministic
services generally follows the principle of generating a specific
burst amount within a cyclic burst interval of a specific length.
For example, a service generates 1000 bits of burst per 1 ms,
where 1 ms is the service burst interval.
Orchestration Period: The orchestration period corresponds to the
gating cycle in TAS, and its length depends on the service burst
intervals of all deterministic flows. It contains a fixed count
(termed N and numbered from 0 to N-1) of timeslots. For example,
the orchestration period includes 1000 timeslots and each timeslot
length is 10 us. The timeslot
resources within the orchestration period can be allocated for
services, i.e., which timeslots are occupied by services and how
many bits are occupied in a timeslot. The orchestration period
is the Least Common Multiple of all service burst intervals. It
is also a multiple of the scheduling period. It is recommended
that all nodes of the network be configured with the same length
of orchestration period (note that timeslot length may still be
different), because it is service-related and also crucial for
establishing a stable timeslot mapping relationship.
Ongoing Sending Period: The orchestration period which the ongoing
sending timeslot belongs to.
Scheduling Period: The scheduling period may be equal to the
orchestration period, or a fraction of the orchestration period.
It reflects the count of timeslot queues that are actually
instantiated on the forwarding plane, which is limited by
hardware capabilities. It contains a fixed count (termed M
and numbered from 0 to M-1) of timeslots. For example, the
scheduling period includes 100 timeslots (i.e., 100 timeslot
queues are instantiated) and each timeslot length is 10 us.
Different nodes can be configured with different scheduling
period lengths. When the orchestration period is greater than
the scheduling period, different parts of the orchestration
period can be mapped to a single scheduling period using
appropriate mapping methods.
Incoming Timeslot: For an intermediate node in a specific path, the
timeslot contained in the packet received from the upstream node
(i.e., the outgoing timeslot of the upstream node) is its
incoming timeslot. An incoming timeslot is the timeslot id in
the orchestration period.
Outgoing Timeslot: For an intermediate node in a specific path, when
it continues to send packets received from the upstream node to
downstream nodes, according to resource reservation or certain
rules, it chooses to send packets in the specified timeslot,
which is the outgoing timeslot. An outgoing timeslot is the
timeslot id in the orchestration period.
Ongoing Sending Timeslot: For the headend of the path, which
receives packets from the client side and sends them to the
downstream node, the ongoing sending timeslot is the timeslot
currently in the sending state when the packet reaches the outgoing
port. For intermediate nodes of the path, which receive packets
from the upstream node and send them to the downstream node, the
ongoing sending timeslot is the timeslot currently in the sending
state when the end of the incoming timeslot to which the packet
belongs reaches the outgoing port. Note that the ongoing sending
timeslot differs from the outgoing timeslot. An ongoing sending
timeslot is a timeslot id in the orchestration period.
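As an illustration of how the periods defined above relate, the following Python sketch derives the orchestration period as the least common multiple of a set of service burst intervals, and then the timeslot counts N and M. The function names and the example burst intervals are assumptions made for illustration only; this document does not define them.

```python
from math import lcm  # available in Python 3.9 and later


def orchestration_period_us(burst_intervals_us):
    """Least common multiple of all service burst intervals, in us."""
    return lcm(*burst_intervals_us)


def slot_counts(lop_us, sched_period_us, slot_len_us):
    """Return (N, M): timeslots per orchestration and scheduling period."""
    if sched_period_us % slot_len_us:
        raise ValueError("scheduling period must be a whole number of timeslots")
    if lop_us % sched_period_us:
        raise ValueError("orchestration period must be a multiple of the scheduling period")
    return lop_us // slot_len_us, sched_period_us // slot_len_us


# Hypothetical flows with burst intervals of 2 ms, 5 ms and 10 ms.
lop_us = orchestration_period_us([2000, 5000, 10000])
n, m = slot_counts(lop_us, sched_period_us=1000, slot_len_us=10)
print(lop_us, n, m)
```

With these assumed inputs the result matches the examples in the definitions: an orchestration period of 1000 timeslots and a scheduling period of 100 timeslots, each timeslot being 10 us long.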
3. Overview
This scheme introduces the time-division multiplexing scheduling
mechanism based on the fixed length timeslot in the IP/MPLS network.
Note that the time-division multiplexing here is a L3 packet-level
scheduling mechanism, rather than the TDM port (such as SONET/SDH)
implemented in L1. The latter generally involves the time frame and
the corresponding framing specification, which is not necessary in
this document. The data structure associated with timeslot resources
may be implemented using round robin queues, or a single PIFO queue,
etc.
Figure 1 shows the TQF scheduling behavior implemented by the
intermediate node P, through which multiple deterministic paths pass,
on the outgoing port (P-PE2).
   +---+                 +---+                 +---+
   |PE1| --------------- | P | --------------- |PE2|
   +---+                 +---+                 +---+
                         orchestration period
                    +---+---+---+---+---------+---+
                    | 0 | 1 | 2 | 3 | ... ... |N-1|
                    +---+---+---+---+---------+---+
                                   ^  ^
                   reserve slots:  |  | reserve slots:
                       a,b,c       |  |    x,y
   path-1 -------------------------o--|---------------->
   path-2 -------------------------|--o---------------->
                                   |  |
                    access slots:  |  | access slots:
                      a',b',c'     v  v   x',y'
                        /  +-------------------+   ___
                        |  | queue-0 @slot 0   |  /   \
                        |  +-------------------+  |   |
                        |  | queue-1 @slot 1   |  |   |
            Scheduling  <  +-------------------+      |
            Period      |  | ... ...           |  |   ^
                        |  +-------------------+  |   |
                        |  | queue-n @slot M-1 |  \___/
                        \  +-------------------+
Figure 1
Here, both the orchestration period and the scheduling period
consist of multiple timeslots. The count of timeslots supported by
the orchestration period is related to the length of the service
burst interval, while the count of timeslots supported by the
scheduling period is limited by hardware capabilities. The total
amount of bits that can be reserved or sent in each timeslot can be
preset, generally not exceeding the result of the service rate
multiplied by the timeslot length. Note that the TQF scheduler may
be configured with a specific service rate.
The orchestration periods of the nodes in the network do not need to
be synchronized, and a phase difference is allowed. Within each node,
the timeslot phases of the orchestration period and the scheduling
period are strictly aligned. This is natural because multiple
scheduling periods form an orchestration period. In other words,
different parts of the orchestration period share and reuse the same
scheduling period. The figure shows round robin queues associated
with the scheduling period.
In the figure, path-1 and path-2 each allocate timeslot resources
from the orchestration period of link P-PE2. Path-1 reserves
timeslots a, b, c from the orchestration period, and finally accesses
timeslots a', b', c' from the scheduling period. Path-2 reserves
timeslots x, y from the orchestration period, and finally accesses
timeslots x', y' from the scheduling period. There is a mapping
function between timeslot i of the orchestration period and timeslot
i' of the scheduling period, i.e., i' = f(i). There are many mapping
options, such as a'=a, a'=a+offset, a'=a%M, and a'=random(a). Which
option to use depends on the specific resource reservation method.
Section 3.2.1 describes one of the options.
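As a minimal sketch of one of the listed options, the Python fragment below implements the modulo mapping i' = i % M. It is only an illustration: the function name and the example values of N, M, and the reserved timeslot ids are assumptions, and the other listed options (fixed offset, random, and so on) are not covered.

```python
def to_scheduling_slot(i, m):
    """Map orchestration-period timeslot id i to scheduling-period id i' = i % M."""
    if i < 0 or m <= 0:
        raise ValueError("timeslot id must be >= 0 and M must be > 0")
    return i % m


N, M = 1000, 100                 # assumed timeslot counts
reserved = [3, 103, 950]         # hypothetical reserved timeslot ids (0..N-1)
accessed = [to_scheduling_slot(i, M) for i in reserved]
print(accessed)                  # [3, 3, 50]
```

Timeslots 3 and 103 of the orchestration period land on the same timeslot queue, which illustrates how different parts of the orchestration period share and reuse the same scheduling period.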
In general, the TQF mechanism implemented on all nodes in the network
may use the same timeslot length and scheduling period length.
However, considering the capability differences of the nodes in the
network (for example, the capabilities of the edge nodes are weaker
than those of the core nodes), it is feasible for different
nodes/links to use different timeslot and scheduling period lengths.
The scheme involves two aspects: the path calculation and timeslot
resource reservation in the control plane, and timeslot resource
access in the data plane.
3.1. Timeslot Resource Reservation in Control-plane
The control plane (centralized controller or distributed protocol)
can reserve the corresponding timeslot resources along the
deterministic path. Note that a path may carry multiple service
flows, in which case the path may reserve timeslot resources for the
aggregated service flow and may reserve burst resources in multiple
timeslots of the orchestration period at the same time. However, it
is still beneficial to distinguish between the reservation sub-tasks
corresponding to different service flows in the combined reservation
task. In this document, a reservation sub-task refers to an
individual timeslot resource reservation action related to a service
flow. Note that one or more reservation sub-tasks for a specific
service flow may be derived from its TSpec, and each reservation
sub-task allocates the corresponding timeslot. The intermediate nodes
do not maintain the state of service flows and only reserve timeslot
resources based on the reservation sub-tasks.
During resource reservation, it is necessary to distinguish between
the requirements of low latency services and non-low latency
services. For low latency services, the physical offset between the
reserved outgoing timeslot and the incoming timeslot is small, while
for non-low latency services this physical offset can be large. An
end-to-end total residence delay budget needs to be maintained for
each reservation sub-task and used to select outgoing timeslots, such
that the sum of the residence delays caused by all nodes does not
exceed the total residence delay budget.
Multiple reservation sub-tasks may generate different incoming/
outgoing timeslot mapping relationships on node P. For example:
* The timeslot mapping relationship created by the sub-task-1:
<(incoming port a, incoming slot id 3), (outgoing port b,
outgoing slot id 60)>
* The timeslot mapping relationship created by the sub-task-2:
<(incoming port a, incoming slot id 3), (outgoing port b,
outgoing slot id 61)>
Special care should be taken not to confuse the use of different
mapping relationships. For specific service flows, P needs to
explicitly use specific timeslot mapping relationships.
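The two example mapping relationships above share the same incoming port and incoming slot id, so a lookup keyed only on the incoming timeslot would be ambiguous. The Python sketch below shows one possible way to keep them apart by keying the table on the reservation sub-task. The table layout, the sub-task identifiers, and the function name are hypothetical and are not defined by this document.

```python
from typing import NamedTuple


class SlotRef(NamedTuple):
    port: str
    slot_id: int


# Hypothetical table on node P: one entry per reservation sub-task.
mappings = {
    "sub-task-1": (SlotRef("a", 3), SlotRef("b", 60)),
    "sub-task-2": (SlotRef("a", 3), SlotRef("b", 61)),
}


def outgoing_slot(sub_task, incoming):
    """Return the outgoing (port, slot id) reserved by the given sub-task."""
    expected_incoming, outgoing = mappings[sub_task]
    if incoming != expected_incoming:
        raise ValueError("incoming timeslot does not match the reservation")
    return outgoing


print(outgoing_slot("sub-task-1", SlotRef("a", 3)))
print(outgoing_slot("sub-task-2", SlotRef("a", 3)))
```

How a packet is associated with its sub-task on the data plane is outside the scope of this sketch.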
It is recommended, but not mandatory, to reserve timeslot resources
on the outgoing port of each hop in order from the headend of the path
to the endpoint, that is, first determine the timeslot reserved on the
headend, then determine the timeslot reserved on the next hop, and so
on. We assume that the service flow has a periodic arrival time and
that there is a fixed positional relationship between the arrival time
and the orchestration period of the headend, so selecting the outgoing
timeslot close to the arrival time, or within the expected offset
range in the orchestration period, can minimize the residence delay of
the packet on the headend. However, it is sometimes necessary to
accept a larger residence delay on the headend and a smaller residence
delay on other nodes to ensure successful path calculation.
3.1.1. Timeslot Mapping Relationship
In order to reserve outgoing timeslot resources for the service flow,
it is necessary to first determine the ongoing sending timeslot
that the incoming timeslot falls into, i.e., the mapping relationship
between the incoming timeslot and the ongoing sending timeslot.
Suppose a path contains three nodes U, V, and W in turn along the
path. All nodes are configured with an orchestration period of the
same length (termed LOP), which is crucial for establishing a fixed
timeslot mapping relationship. Node U is configured with timeslot
length L_u, and its orchestration period contains N_u timeslots. Node
V is configured with timeslot length L_v, and its orchestration period
contains N_v timeslots. In general, the link bandwidth of edge nodes
is small, and they will be configured with a larger timeslot length
than the aggregation/backbone nodes.
Because LOP = N_u*L_u = N_v*L_v, LOP is a common multiple of L_u and
L_v, and is therefore also a multiple of their least common multiple
(LCM).
Two methods are provided in the following sub-sections to determine
the mapping relationship between the incoming timeslot and the
ongoing sending timeslot.
3.1.1.1. Deduced by Single Timeslot Mapping Detection
Figure 2 shows that Node U sends a detection packet from the end (or
head, the process is similar) of an arbitrary timeslot i on the
outgoing port connected to node V. After a certain link propagation
delay (D_propagation), the packet is received by the incoming port of
node V, and i is regarded as the incoming timeslot by V. The packet
finally arrives at the outgoing port connected to node W after the
intra-node forwarding delay (D_forwarding) including parsing, table
lookup, internal fabric exchange, etc. At this time, the ongoing
sending timeslot is j, and there is time T_ij left before the end of
the timeslot j.
    |<------------------------ LOP ---------------------------->|
    +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  U |   |   |   | i |   |   |   |   |   | x |   |   |   |   |   |
    +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
                    |                       |
                    |<--T_ij->|             |<--T_xy->|
                    v                       v
      +-----------+-----------+-----------+-----------+-----------+
    V |           |     j     |  ... ...  |     y     |           |
      +-----------+-----------+-----------+-----------+-----------+
      |<------------------------ LOP ---------------------------->|
Figure 2
Then, based on the detected mapping relationship between incoming
timeslot i and ongoing sending timeslot j, for any other outgoing
timeslot x of node U, the mapped ongoing sending timeslot y of node V
is (where "/" denotes integer division, rounding down):
* y = (j + ((N_u+x-i)*L_u-T_ij)/L_v + 1) % N_v
And the time T_xy left before the end of timeslot y is:
* T_xy = L_v - ((N_u+x-i)*L_u-T_ij)%L_v
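The two formulas above can be sketched in Python as follows. This is an illustrative transcription only: the function name is an assumption, all times are assumed to be integers in the same unit (e.g., us), and "/" is interpreted as floor division. The example values are chosen for illustration and do not come from this document.

```python
def map_by_detection(x, i, j, t_ij, l_u, n_u, l_v, n_v):
    """Map outgoing timeslot x of node U to (y, T_xy) on node V.

    i, j, t_ij: detected mapping (incoming slot i fell into ongoing
    sending slot j with t_ij time units left before the end of j).
    l_u, n_u / l_v, n_v: timeslot length and count on node U / node V.
    """
    # Time from the end of slot i to the end of slot x (one period later
    # if needed), measured from the start point of the T_ij interval.
    elapsed = (n_u + x - i) * l_u - t_ij
    y = (j + elapsed // l_v + 1) % n_v      # floor division
    t_xy = l_v - elapsed % l_v
    return y, t_xy


# Assumed example: L_u = 10 us, N_u = 15, L_v = 30 us, N_v = 5 (LOP = 150 us).
print(map_by_detection(x=9, i=3, j=1, t_ij=25, l_u=10, n_u=15, l_v=30, n_v=5))
# (3, 25)
```

As a sanity check, calling the function with x equal to i returns the detected pair (j, T_ij) itself.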
Note that the detection message used to obtain the mapping
relationship i->j does not actually need to be sent to the outgoing
port; that is, the mapping relationship can be obtained on the
incoming port side rather than only on the outgoing port. This is
easy to achieve, assuming that the orchestration periods of all ports
within a node are strictly synchronized. On receiving the detection
message, the incoming port immediately determines the ongoing sending
timeslot j' that the incoming timeslot falls into and the
corresponding T_ij', and then uses a fixed forwarding delay estimate
(not less than the actual forwarding delay D_forwarding) to estimate
the timeslot j that the incoming timeslot falls into and the
corresponding T_ij.
3.1.1.2. Deduced by Phase Difference of Orchestration Period
Figure 3 shows that Node U sends a detection packet from the end (or
head, the process is similar) of the orchestration period on the
outgoing port connected to node V. After a certain link propagation
delay (D_propagation), the packet is received by the incoming port of
node V and finally arrives at the outgoing port connected to node W
after the intra-node forwarding delay (D_forwarding). At this time,
there is time P_uv left before the end of the ongoing sending period.
    |<------------------- LOP --------------------->|
    +---+---+---+---+---+---+---+---+---+---+---+---+
  U |   |   |   |   |   |   | x |   |   |   |   |   |
    +---+---+---+---+---+---+---+---+---+---+---+---+
    |                           |
    |<----P_uv--->|             |<--T_xy->|
    |                           v
                  +-----------+-----------+-----------+-----------+
                V |           |     y     |           |           |
                  +-----------+-----------+-----------+-----------+
                  |<------------------- LOP --------------------->|
Figure 3
Then, based on the phase difference of the orchestration period, for
any outgoing timeslot x of node U, the mapped ongoing sending timeslot
y of node V is (where "/" denotes integer division, rounding down):
* y = ((LOP+(x+1)*L_u-P_uv)/L_v) % N_v
And the time T_xy left before the end of timeslot y is:
* T_xy = L_v - (LOP+(x+1)*L_u-P_uv)%L_v
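A minimal Python sketch of the phase-difference formulas above is given below. The function name is an assumption, all times are assumed to be integers in the same unit (e.g., us), and "/" is interpreted as floor division. The example values are illustrative only.

```python
def map_by_phase(x, p_uv, l_u, l_v, n_v, lop):
    """Map outgoing timeslot x of node U to (y, T_xy) on node V.

    p_uv: time left before the end of V's ongoing sending period when
    the end of U's orchestration period arrives at V's outgoing port.
    """
    # Arrival time of the end of slot x, relative to the start of V's
    # orchestration period (shifted by one LOP to stay non-negative).
    elapsed = lop + (x + 1) * l_u - p_uv
    y = (elapsed // l_v) % n_v              # floor division
    t_xy = l_v - elapsed % l_v
    return y, t_xy


# Assumed example: L_u = 10 us, L_v = 30 us, N_v = 4, LOP = 120 us.
print(map_by_phase(x=6, p_uv=35, l_u=10, l_v=30, n_v=4, lop=120))
# (1, 25)
```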
Note that the detection message used to obtain the phase difference
of the orchestration period does not actually need to be sent to the
outgoing port; that is, the phase difference can be obtained on the
incoming port side rather than only on the outgoing port. Assuming
again that the orchestration periods of all ports within a node are
strictly synchronized, the incoming port, on receiving the detection
message, immediately determines the phase difference P_uv', and then
uses a fixed forwarding delay estimate (not less than the actual
forwarding delay D_forwarding) to estimate the phase difference
P_uv.
Note that the method in Section 3.1.1.1 can also be used to first
derive the phase difference of the orchestration period from the
mapping relationship i->j, and then to obtain the mapping
relationship for other timeslots according to the above formulas.
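The document does not give a formula for deriving the phase difference from the mapping i->j. One possible derivation, offered here as an assumption rather than as the authors' method, is P_uv = ((i+1)*L_u + T_ij - (j+1)*L_v) mod LOP: the end of slot i arrives at V at offset (j+1)*L_v - T_ij within V's period, and the end of U's period arrives LOP - (i+1)*L_u later. The Python sketch below uses this assumed expression and checks that both mapping methods then agree for every timeslot x. All names and example values are illustrative.

```python
def phase_from_detection(i, j, t_ij, l_u, l_v, lop):
    """Assumed derivation of P_uv from a detected mapping i -> j."""
    return ((i + 1) * l_u + t_ij - (j + 1) * l_v) % lop


def map_by_detection(x, i, j, t_ij, l_u, n_u, l_v, n_v):
    """Formulas of Section 3.1.1.1, with '/' read as floor division."""
    elapsed = (n_u + x - i) * l_u - t_ij
    return (j + elapsed // l_v + 1) % n_v, l_v - elapsed % l_v


def map_by_phase(x, p_uv, l_u, l_v, n_v, lop):
    """Formulas of Section 3.1.1.2, with '/' read as floor division."""
    elapsed = lop + (x + 1) * l_u - p_uv
    return (elapsed // l_v) % n_v, l_v - elapsed % l_v


# Assumed example: L_u = 10 us, N_u = 15, L_v = 30 us, N_v = 5.
L_U, N_U, L_V, N_V = 10, 15, 30, 5
LOP = L_U * N_U
p_uv = phase_from_detection(i=3, j=1, t_ij=25, l_u=L_U, l_v=L_V, lop=LOP)
agree = all(
    map_by_detection(x, 3, 1, 25, L_U, N_U, L_V, N_V)
    == map_by_phase(x, p_uv, L_U, L_V, N_V, LOP)
    for x in range(N_U)
)
print(p_uv, agree)
```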
3.1.2. Timeslot Resource Definition
The timeslot resources of a link can be represented as the
corresponding bit amounts of all timeslots included in an
orchestration period. Basically, the link capability should contain
the following information:
* Timeslot Length (L_T): Represents the length of the timeslot, in
units of us. Generally, the length of each timeslot included in
the orchestration period is the same.
* Length of Orchestration Period (LOP): Indicates the number of
timeslots (N) included in the orchestration period, numbered
sequentially from 0 to N-1.
* Length of Scheduling Period (LSP): Indicates the number of
timeslots (M) included in the scheduling period, numbered
sequentially from 0 to M-1.
Figure 4 shows the timeslot resource model of the link, with an
orchestration period consisting of N timeslots numbered from 0 to
N-1. The resource information of each timeslot includes the
following attributes:
* Timeslot ID: Indicates the NO. of the timeslot in the
orchestration period. The NO. of the first timeslot is 0, and the
NO. of the last timeslot is N-1.
* Maximum Reservable Bursts (MRB): Refers to the maximum bit quota
of this timeslot, in units of bits. It is a configurable preset
value related to the service rate (termed C) and the length of the
timeslot (termed L_T), and should be set to a value not exceeding
C*L_T. Generally, the Maximum Reservable Bursts of all timeslots
included in the orchestration period are the same.
* Unreserved Bursts (UB): Refers to the amount of bits still
reservable in this timeslot, in units of bits.
   #N-1  +-------------------------------------+
         | Timeslot Length: L_T(N-1)           |
         | Maximum Reservable Bursts: MRB(N-1) |
         | Unreserved Bursts: UB(N-1)          |
         +-------------------------------------+
   ...     ... ...
   ...     ... ...
   #1    +-------------------------------------+
         | Timeslot Length: L_T(1)             |
         | Maximum Reservable Bursts: MRB(1)   |
         | Unreserved Bursts: UB(1)            |
         +-------------------------------------+
   #0    +-------------------------------------+
         | Timeslot Length: L_T(0)             |
         | Maximum Reservable Bursts: MRB(0)   |
         | Unreserved Bursts: UB(0)            |
         +-------------------------------------+
   ----------------------------------------------------------->
                  Timeslot Resource of the Link
Figure 4
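The per-timeslot attributes above can be sketched as a small data
structure. This is only an illustrative sketch, not part of the
mechanism itself: the names Timeslot, build_link_timeslots, and
mrb_percent are hypothetical, and it assumes that every timeslot of the
orchestration period has the same length and the same MRB.

```python
from dataclasses import dataclass


@dataclass
class Timeslot:
    """One timeslot of the orchestration period (hypothetical model)."""
    slot_id: int    # Timeslot ID, 0 .. N-1
    mrb_bits: int   # Maximum Reservable Bursts, in bits
    ub_bits: int    # Unreserved Bursts, in bits

    def reserve(self, burst_bits: int) -> bool:
        """Reserve burst_bits if enough unreserved bits remain."""
        if burst_bits > self.ub_bits:
            return False
        self.ub_bits -= burst_bits
        return True


def build_link_timeslots(rate_bps, slot_len_us, n_slots, mrb_percent=100):
    """Create N timeslots whose MRB does not exceed C * L_T.

    mrb_percent is an assumed knob that leaves headroom for
    non-deterministic traffic; it must be between 0 and 100.
    """
    if not 0 <= mrb_percent <= 100:
        raise ValueError("MRB must not exceed C * L_T")
    capacity_bits = rate_bps * slot_len_us // 1_000_000   # C * L_T
    mrb = capacity_bits * mrb_percent // 100
    return [Timeslot(i, mrb, mrb) for i in range(n_slots)]


slots = build_link_timeslots(100_000_000_000, 10, n_slots=4, mrb_percent=80)
first_ok = slots[0].reserve(500_000)    # fits into the 800,000-bit quota
second_ok = slots[0].reserve(400_000)   # exceeds the remaining 300,000 bits
```

A reservation only decrements UB; MRB stays fixed as the configured
ceiling of the timeslot.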
The IGP/BGP extensions to advertise the link's capability and
timeslot resources are defined in
[I-D.peng-lsr-deterministic-traffic-engineering].
3.1.3. Arrival Position in the Orchestration Period
Generally, a deterministic service flow has a TSpec, for example
periodically generating traffic of a specific burst size within a
specific burst interval, and this traffic regularly reaches the
network entry. The headend executes traffic regulation (e.g.,
setting appropriate parameters for leaky bucket shaping), which
generally makes packets evenly distributed within the service burst
interval, i.e., there are one or more shaped sub-bursts in the
service burst interval. There is a fixed positional relationship
between the departure time at which each sub-burst leaves the
regulator and the orchestration period, and a specific outgoing
timeslot is reserved for the sub-burst based on this relationship.
Note that the departure time of a sub-burst from the regulator may
deviate, that is, its position relative to the orchestration period
may deviate. Therefore, this deviation should be taken into account
when reserving the outgoing timeslot (see Section 9 for more
considerations).
Figure 5 shows, for some typical service flows, the relationship
between the service burst interval (SBI) and the length of
orchestration period (LOP) of headend, as well as the possible
timeslot resource reservation results for these service flows.
|<--------------------- LOP ---------------------->|
+----+----+----+----+----+----+----+----------+----+
| #0 | #1 | #2 | #3 | #4 | #5 | #6 | ... ... |#N-1|
+----+----+----+----+----+----+----+----------+----+
+--+
Service 1: | |b1| |
+-----+--+-----------------------------------------+
|<------------------- SBI ------------------------>|
+--+ +--+
Service 2: | |b1| |b2|
+------------+--+------------------------+--+------+
|<------------------- SBI ------------------------>|
+------+
Service 3: | | b1 |
+---------------------------+------+---------------+
|<------------------- SBI ------------------------>|
+--+ +--+ +--+
Service 4: | |b1| | |b1| | |b1| |
+----+--+--------+----+--+--------+----+--+--------+
|<----- SBI ---->|<----- SBI ---->|<----- SBI ---->|
Figure 5
As shown in the figure, the length of service burst intervals for
services 1, 2, 3 is equal to the length of orchestration period,
while the length of the service burst interval for service 4 is only
1/3 of the orchestration period.
* Service 1 generates a single, very small burst within its burst
interval, and may reserve timeslot 2 or another subsequent
timeslot in the orchestration period;
* Service 2 generates two small discrete sub-bursts within its burst
interval, both of which are shaped, and may reserve timeslots 4
and N-1 in the orchestration period, one for each sub-burst;
* Service 3 generates a single large burst within its burst
interval that is not really shaped (because it has purchased a
larger burst resource and is served by a larger bucket depth).
The burst may be split into multiple back-to-back sub-bursts that
reserve multiple timeslots in the orchestration period, such as
timeslots 8 and 9.
* The service burst interval of service 4 is only 1/3 of the
orchestration period, so first construct service 4', whose burst
interval is equal to the length of the orchestration period and
contains three burst intervals of service 4. Service 4' is then
similar to service 2, generating three small separate sub-bursts
within its burst interval. It may reserve timeslots 3, 7, and N-1
in the orchestration period.
Each sub-burst corresponds to a reservation sub-task. For
simplicity, each regulated sub-burst in the service burst interval
always reserves timeslot resources according to max{sub-bursts}.
For a specific service flow, the number of reservation sub-tasks
required can be determined as follows:
* First, align the service burst interval with the orchestration
period of the headend so that the two are of equal length. If the
service burst interval is only a fraction of the orchestration
period, repeat it the corresponding number of times to obtain an
expanded service burst interval, i.e., a new service'.
* Check how many discrete sub-bursts will be generated during the
orchestration period, and for each sub-burst:
- If the proportion of the sub-burst size to the MRB of a single
timeslot does not exceed a specific value, then the sub-burst
corresponds to a reservation sub-task;
- Otherwise, continue to split the sub-burst into multiple sub-
sub-bursts (note that each sub-sub-burst must contain a
complete packet), so that the proportion of each sub-sub-burst
size to the MRB of a single timeslot does not exceed the
specific value, and each sub-sub-burst corresponds to a
reservation sub-task.
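The procedure above can be sketched as follows. This is a minimal
sketch under stated assumptions: the orchestration period is assumed to
be an integer multiple of the service burst interval, each sub-burst is
given as a list of packet sizes in bits, and packets are packed
greedily in arrival order. The function names and the max_ratio
parameter (the "specific value" of the text) are hypothetical.

```python
def split_sub_burst(packets_bits, mrb_bits, max_ratio):
    """Split one sub-burst into sub-sub-bursts of whole packets.

    Each returned group has a total size not exceeding
    max_ratio * mrb_bits. Packets are never fragmented.
    """
    limit = mrb_bits * max_ratio
    groups, current, size = [], [], 0
    for pkt in packets_bits:
        if pkt > limit:
            raise ValueError("a single packet exceeds the per-slot limit")
        if current and size + pkt > limit:
            groups.append(current)      # close the current sub-sub-burst
            current, size = [], 0
        current.append(pkt)
        size += pkt
    if current:
        groups.append(current)
    return groups


def reservation_subtasks(sbi_us, lop_us, sub_bursts, mrb_bits, max_ratio):
    """Return one packet group per reservation sub-task."""
    if lop_us % sbi_us != 0:
        raise ValueError("assumed: LOP is an integer multiple of the SBI")
    expanded = sub_bursts * (lop_us // sbi_us)   # build service'
    tasks = []
    for sub_burst in expanded:
        tasks.extend(split_sub_burst(sub_burst, mrb_bits, max_ratio))
    return tasks


# Like service 4: SBI is 1/3 of the LOP, one small sub-burst per SBI.
tasks_s4 = reservation_subtasks(100, 300, [[1000]], 10_000, 0.5)
# Like service 3: one large burst that must be split.
tasks_s3 = reservation_subtasks(
    300, 300, [[3000, 2000, 3000, 2000]], 10_000, 0.5)
```

With these inputs, the first flow yields three sub-tasks (one per
repeated burst interval) and the second flow yields two sub-tasks of
5000 bits each.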
3.1.4. Process of Each Reservation Sub-task
Each reservation sub-task contains a separate parameter set, which is
used in the process of timeslot resource reservation. Note that this
set may be local information for the path computation engine (e.g., a
controller), or may be signaled between nodes (e.g., by RSVP-TE).
* Total Residence Budget: It is the sum of the residence delay
allowed by the service flow within all nodes in the path, which is
equal to the end-to-end delay requirement of the service flow
minus the propagation delay of all links included in the path.
* Node Residence Budget: It refers to the residence delay budget of
the current node when timeslot resources are reserved on each node
along the path in sequence. A simple way is to divide the Total
Residence Budget by the number of nodes included in the path and
use the resulting average residence delay budget as the Node
Residence Budget for each node. Alternatively, a specified budget
list can give the residence delay budget of each node separately.
* Accumulated Node Residence Budget: It refers to the cumulative
residence delay budget of those nodes that have executed resource
reservation.
* Accumulated Node Residence Evaluation: It refers to the cumulative
residence delay evaluation value of the nodes that have executed
timeslot resource reservation. The residence delay evaluation
value of a node is calculated with the delay formula (see below)
when the node actually reserves a certain outgoing timeslot for
the reservation sub-task. Generally, if a node is able to reserve
the expected outgoing timeslot according to its residence delay
budget, the residence delay evaluation value does not differ from
the residence delay budget. However, in some cases, due to
insufficient resources in the expected timeslot, resources have to
be reserved in a timeslot adjacent to the expected timeslot,
which can lead to a difference between the residence delay
evaluation value and the budget value.
* Accumulated Node Residence Deviation: It is equal to the
Accumulated Node Residence Budget minus the Accumulated Node
Residence Evaluation.
* Node Residence Budget Adjustment: It is equal to the Node
Residence Budget plus the Accumulated Node Residence Deviation.
The above parameter set is used as follows:
* For a specific reservation sub-task, determine the Node Residence
Budget for each node in the path, which can be taken from the
average residence delay budget per node or from the specified
budget list.
* From the headend to the endpoint, on each node's outgoing port in
sequence, reserve outgoing timeslot resources based on the Node
Residence Budget Adjustment, so that the residence delay evaluation
value of the node obtained from the reserved outgoing timeslot is
equal to or close to the Node Residence Budget Adjustment.
- On the headend, the Accumulated Node Residence Deviation has the
initial value 0. Therefore, the Node Residence Budget
Adjustment is equal to the Node Residence Budget.
- On any other node, the Accumulated Node Residence Deviation is
generally not 0. If the residence delay evaluation value of
the node obtained from the reserved outgoing timeslot is equal
to the Node Residence Budget Adjustment, the Accumulated Node
Residence Deviation seen by the next downstream node in the
path returns to 0.
Note that the above parameter set is only an implementation choice
and is not mandatory. There may be more intelligent path calculation
methods available.
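The bookkeeping of this parameter set along a path can be sketched as
follows. This is only an illustration of the definitions above: the
function walk_path and the callback evaluate are hypothetical, and the
callback stands in for whatever slot selection a node performs, simply
returning the residence delay evaluation value that was actually
achieved.

```python
def walk_path(node_budgets, evaluate):
    """Track budget, evaluation, and deviation from headend to endpoint.

    node_budgets: Node Residence Budget of each node, in path order.
    evaluate(idx, adjusted): returns the residence delay evaluation
        value achieved on node idx when aiming at `adjusted`.
    Returns the per-node (adjusted budget, achieved value) pairs and
    the final Accumulated Node Residence Deviation.
    """
    acc_budget = 0   # Accumulated Node Residence Budget
    acc_eval = 0     # Accumulated Node Residence Evaluation
    trace = []
    for idx, budget in enumerate(node_budgets):
        deviation = acc_budget - acc_eval   # 0 on the headend
        adjusted = budget + deviation       # Node Residence Budget Adjustment
        achieved = evaluate(idx, adjusted)
        acc_budget += budget
        acc_eval += achieved
        trace.append((adjusted, achieved))
    return trace, acc_budget - acc_eval


# Node 0 can only reserve a slot 5 us earlier than budgeted; node 1
# absorbs the deviation, so node 2 again sees a deviation of 0.
achievable = {0: 25, 1: 35, 2: 30}
trace, final_deviation = walk_path(
    [30, 30, 30], lambda idx, adjusted: achievable[idx])
```

In this example the adjusted budgets are 30, 35, and 30 us: the 5 us
saved on the headend is handed to the second node.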
3.1.4.1. Resource Reservation on the Ingress Node
On the headend H, as mentioned above, there is a fixed positional
relationship (with possible jitter) between the departure time when
the sub-burst leaves the regulator and the orchestration period.
After leaving the regulator, the sub-burst experiences the intra-node
forwarding delay (d_f), which includes parsing, table lookup, internal
fabric exchange, etc., and finally arrives at the outgoing port. At
this time the ongoing sending timeslot is j, and there is time T_j
left before the end of timeslot j.
The outgoing timeslot reserved for the sub-burst by the headend is
offset by o (>=1) timeslots after timeslot j, which means the
outgoing timeslot is (j+o)%N_h, where N_h is the number of timeslots
in the orchestration period for node H.
Note that o must be less than M.
Thus, on the headend H the residence delay evaluation value obtained
from the reserved outgoing timeslot (j+o)%N_h is:
Best Node Residence Evaluation = d_f + T_j + (o-1)*L_h
Worst Node Residence Evaluation = d_f + T_j + o*L_h
Average Node Residence Evaluation = d_f + T_j + (2o-1)*L_h/2
where L_h is the length of the timeslot for node H.
The Best Node Residence Evaluation occurs when the sub-burst is sent
at the head of outgoing timeslot j+o. The Worst Node Residence
Evaluation occurs when the sub-burst is sent at the end of outgoing
timeslot j+o. The delay jitter within the node is L_h. However, the
jitter of the entire path is not the sum of the jitters of all nodes.
Depending on the implementation, the above Best Node Residence
Evaluation, Worst Node Residence Evaluation, or Average Node
Residence Evaluation can be used to compare with the Node Residence
Budget Adjustment, so that when selecting the appropriate outgoing
timeslot (j+o)%N_h, the two are equal or nearly equal, and the
corresponding Unreserved Burst resources of the outgoing timeslot
(j+o)%N_h meet the burst demand of the sub-burst. However, this
document suggests using the Average Node Residence Evaluation to
compare with the Node Residence Budget Adjustment, because the
characteristic of the forwarding behavior based on TQF is that
adjacent nodes on the path will not simultaneously face the best or
worst residency delay.
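The headend evaluation and the choice of the offset o can be sketched
as follows. This is a hedged illustration, not a normative algorithm:
the function names are hypothetical, the Unreserved Bursts are assumed
to be available as a list indexed by timeslot id, the Average Node
Residence Evaluation is used for the comparison as suggested above, and
ties are resolved toward the smaller offset.

```python
def headend_residence(d_f, t_j, o, l_h):
    """Best, worst, and average residence evaluation on the headend."""
    best = d_f + t_j + (o - 1) * l_h
    worst = d_f + t_j + o * l_h
    average = d_f + t_j + (2 * o - 1) * l_h / 2
    return best, worst, average


def pick_offset(j, target, d_f, t_j, l_h, n_h, m, ub_bits, burst_bits):
    """Pick o (1 <= o < M) whose average evaluation is closest to target.

    Only timeslots whose Unreserved Bursts cover the sub-burst are
    considered. Returns (o, outgoing timeslot) or None.
    """
    choice = None
    for o in range(1, m):
        slot = (j + o) % n_h
        if ub_bits[slot] < burst_bits:
            continue                      # not enough unreserved bits
        gap = abs(headend_residence(d_f, t_j, o, l_h)[2] - target)
        if choice is None or gap < choice[2]:
            choice = (o, slot, gap)
    return None if choice is None else choice[:2]


ub = [1000] * 10
picked = pick_offset(j=8, target=34, d_f=5, t_j=4, l_h=10,
                     n_h=10, m=5, ub_bits=ub, burst_bits=500)
ub[1] = 0   # the expected timeslot 1 is now full
fallback = pick_offset(j=8, target=34, d_f=5, t_j=4, l_h=10,
                       n_h=10, m=5, ub_bits=ub, burst_bits=500)
```

When the expected timeslot is full, the sketch falls back to an
adjacent timeslot, which is exactly the case that produces a non-zero
deviation between evaluation and budget.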
3.1.4.2. Resource Reservation on the Transit Node
On the transit node V, as described in Section 3.1.1, there is a
timeslot mapping relationship between the outgoing timeslot of the
upstream node U and the ongoing sending timeslot of node V.
For a specific sub-task, assume that an outgoing timeslot i is
reserved for it on the outgoing port of the upstream node U. After
the intra-node forwarding delay (d_f), timeslot i is mapped to the
ongoing sending timeslot j of node V, and there is time T_ij left
before the end of timeslot j.
The outgoing timeslot reserved for the sub-task by node V is offset
by o (>=1) timeslots after timeslot j, which means the outgoing
timeslot is (j+o)%N_v, where N_v is the number of timeslots in the
orchestration period of node V.
Note that o must be less than M.
Thus, on the transit node V the residence delay evaluation value
obtained from the reserved outgoing timeslot (j+o)%N_v is:
Best Node Residence Evaluation = d_f + T_ij + (o-1)*L_v
Worst Node Residence Evaluation = d_f + T_ij + L_u + o*L_v
Average Node Residence Evaluation = d_f + T_ij + (L_u+(2o-1)*L_v)/2
where L_u and L_v are the timeslot lengths of nodes U and V,
respectively.
The Best Node Residence Evaluation occurs when the packet is received
at the end of incoming timeslot i and sent at the head of outgoing
timeslot j+o. The Worst Node Residence Evaluation occurs when the
packet is received at the head of incoming timeslot i and sent at the
end of outgoing timeslot j+o. The delay jitter within the node is
(L_u + L_v). However, the jitter of the entire path is not the sum of
the jitters of all nodes.
Depending on the implementation, the above Best Node Residence
Evaluation, Worst Node Residence Evaluation, or Average Node
Residence Evaluation can be used to compare with the Node Residence
Budget Adjustment, so that when selecting the appropriate outgoing
timeslot (j+o)%N_v, the two are equal or nearly equal, and the
corresponding Unreserved Burst resources of the outgoing timeslot
(j+o)%N_v meet the burst demand of the sub-burst. However, this
document suggests using the Average Node Residence Evaluation to
compare with the Node Residence Budget Adjustment, because the
characteristic of the forwarding behavior based on TQF is that
adjacent nodes on the path will not simultaneously face the best or
worst residency delay.
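The three transit-node formulas can be checked numerically with a
short sketch. The function name transit_residence is hypothetical, and
the sample values (all in us) are arbitrary.

```python
def transit_residence(d_f, t_ij, o, l_u, l_v):
    """Best, worst, and average residence evaluation on transit node V.

    l_u and l_v are the timeslot lengths of the upstream node U and of
    node V; o is the offset of the reserved outgoing timeslot.
    """
    best = d_f + t_ij + (o - 1) * l_v
    worst = d_f + t_ij + l_u + o * l_v
    average = d_f + t_ij + (l_u + (2 * o - 1) * l_v) / 2
    return best, worst, average


best, worst, average = transit_residence(d_f=5, t_ij=3, o=2, l_u=10, l_v=20)
jitter = worst - best   # equals L_u + L_v
```

The average value is the midpoint of the best and worst values, and
their difference is the per-node jitter L_u + L_v stated above.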
3.1.4.3. Resource Reservation on the Egress Node
Generally, for the deterministic path carrying the service flow, the
flow needs to continue forwarding from the outgoing port of the
egress node to the client side, and also faces the issues of
queueing. However, the outgoing port facing the client side is not
part of the deterministic path. If it is necessary to continue
supporting TQF mechanism on that port, timeslot resources should be
reserved on the higher-level service path (an overlay path) using the
above reservation method. In this case, the deterministic path will
serve as a virtual link of the overlay path, providing a
deterministic delay performance.
Therefore, for deterministic paths, the residence delay evaluation
value on the egress node is only contributed by the forwarding delay
(d_f), which includes parsing, table lookup, internal fabric exchange,
etc.
3.1.4.4. End-to-end Delay and Jitter
Figure 6 shows a path from headend P1 to endpoint E. For each node
Pi, the length of the timeslot is L_i, the intra-node forwarding
delay is F_i, the remaining time before the end of the mapped ongoing
sending timeslot is T_i, and the number of timeslots by which the
outgoing timeslot is offset from the ongoing sending timeslot is o_i.
F_e denotes the forwarding delay on the endpoint E. The end-to-end
delay can then be evaluated as follows:
Best E2E Delay = sum(F_i+T_i+o_i*L_i, for 1<=i<=n) - L_n + F_e
Worst E2E Delay = sum(F_i+T_i+o_i*L_i, for 1<=i<=n) + F_e
+---+ +---+ +---+ +---+ +---+
| P1| --- | P2| --- | P3| --- ... --- | Pn| --- | E |
+---+ +---+ +---+ +---+ +---+
Figure 6
The Best E2E Delay occurs when the sub-burst is sent at the head of
the outgoing timeslot of node Pn. The Worst E2E Delay occurs when the
sub-burst is sent at the end of the outgoing timeslot of node Pn. The
delay jitter is L_n. Note that at the headend P1, regardless of
whether the sub-burst experiences the best or worst residence latency,
it will be aligned to the worst latency on the downstream node. The
same holds for every hop except the last one.
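As a worked illustration of the two end-to-end formulas, the following
sketch evaluates them for a three-node path. The function name
e2e_delay and the sample values are hypothetical, and link propagation
delays are left out, as in the formulas themselves.

```python
def e2e_delay(nodes, f_e):
    """Best and worst end-to-end delay for a path P1..Pn, E.

    nodes: list of (F_i, T_i, o_i, L_i) tuples for P1..Pn.
    f_e:   forwarding delay on the endpoint E.
    """
    total = sum(f + t + o * l for f, t, o, l in nodes)
    l_n = nodes[-1][3]          # timeslot length of the last node Pn
    worst = total + f_e
    best = total - l_n + f_e
    return best, worst


path = [(5, 4, 2, 10),   # P1
        (5, 3, 1, 10),   # P2
        (5, 6, 3, 20)]   # P3 (= Pn)
best_e2e, worst_e2e = e2e_delay(path, f_e=5)
```

The difference between the two results is the timeslot length of the
last node, which matches the end-to-end jitter L_n.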
3.2. Timeslot Resource Access in Data-plane
The headend of the path needs to maintain the timeslot resource
information with the granularity of sub-burst, so that each sub-burst
of the service flow can access the mapped timeslot resources.
However, the intermediate node does not need to maintain this mapping
state. The intermediate node only accesses the timeslot resources
based on the timeslot id carried in the packets or indicated by FIB
entries.
The entry node determines the appropriate outgoing timeslot and sends
the packet according to the periodic arrival time of the sub-burst,
and the maintained mapping relationship between the sub-burst of
service flow and the outgoing timeslot.
The relationship between the incoming timeslot and the outgoing
timeslot can be installed on the intermediate node or carried in the
packet, so that the packet can access the corresponding outgoing
timeslot on the intermediate node.
Note that the incoming and outgoing timeslots mentioned here are both
timeslot ids within the orchestration period.
It should be noted that the outgoing port for the service flow is
still determined according to the traditional routing entries (e.g.,
Segment Routing), but the outgoing timeslot used by the packet is
determined by the timeslot resource reservation information.
3.2.1. Conversion of Timeslot ID
Figure 1 shows that the scheduling period implemented on the
forwarding plane is not completely equivalent to the orchestration
period of the control plane. The scheduling period includes M
timeslots (from 0 to M-1), while the orchestration period includes N
timeslots (from 0 to N-1). Therefore, it is necessary to convert the
outgoing timeslot of the orchestration period to the target timeslot
of the scheduling period, and insert the packet to the queue
corresponding to the target timeslot for transmission.
A simple conversion method is:
* target timeslot = outgoing timeslot % M
This is safe because the constraint o < M is always enforced during
resource reservation, and N is an integer multiple of M.
In the orchestration period, timeslots 0 to M-1 form the first
scheduling period, timeslots M to 2M-1 form the second scheduling
period, and so on. Timeslots N-M to N-1 form the (N/M)-th scheduling
period. Each timeslot in the scheduling period
corresponds to an associated queue, which is used to store packets
for sending in the corresponding timeslot.
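The conversion can be sketched in a few lines. The function name
to_scheduling_slot is hypothetical; besides the target timeslot it also
returns the zero-based index of the scheduling period within the
orchestration period, which is only added here for illustration.

```python
def to_scheduling_slot(outgoing_slot, n, m):
    """Map an orchestration-period timeslot id to a scheduling-period one.

    Returns (target timeslot, zero-based scheduling period index).
    Requires N to be an integer multiple of M.
    """
    if n % m != 0:
        raise ValueError("N must be an integer multiple of M")
    if not 0 <= outgoing_slot < n:
        raise ValueError("outgoing timeslot out of range")
    return outgoing_slot % m, outgoing_slot // m


# N = 12, M = 4: outgoing timeslot 9 uses queue 1 in the third
# scheduling period (index 2).
target, period_index = to_scheduling_slot(9, n=12, m=4)
```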
According to the timeslot resource reservation process described
above, when the sub-burst corresponding to any outgoing timeslot
(e.g., z) arrives at the outgoing port of any node of the path, the
ongoing sending timeslot (e.g., j) in the orchestration period of the
outgoing port must precede the outgoing timeslot (z) by o timeslots,
with o < M. This means that the sub-burst does not arrive at the node
at a random time, but strictly on schedule, so that when it reaches
the outgoing port it definitely falls into the ongoing sending
timeslot (j).
Next, we briefly demonstrate that the sub-burst that arrives at the
outgoing port during the ongoing sending timeslot (j) can be safely
inserted into the corresponding queue in the scheduling period, and
that queue will not overflow.
Assume that each timeslot in the orchestration period has a virtual
queue, and that the length of the virtual queue is the MRB of that
timeslot. For example, let queue-z denote the virtual queue
corresponding to the outgoing timeslot (z). The packets that can be
inserted into queue-z may only come from the following bursts:
During the ongoing sending timeslot j = (z-M+1+N)%N, the bursts
that arrive at the outgoing port, that is, these bursts may
reserve the outgoing timeslot (z) according to o = M-1.
During the ongoing sending timeslot j = (z-M+2+N)%N, the bursts
that arrive at the outgoing port, that is, these bursts may
reserve the outgoing timeslot (z) according to o = M-2.
... ...
During the ongoing sending timeslot j = (z-1+N)%N, the bursts that
arrive at the outgoing port, that is, these bursts may reserve the
outgoing timeslot (z) according to o = 1;
The total reserved amount of all these bursts does not exceed the
MRB of the outgoing timeslot (z).
Then, when the ongoing sending timeslot changes to z, queue-z will be
sent and cleared. In the following time, starting from timeslot z+1
to the last timeslot N-1 in the orchestration period, there are no
longer any packets inserted into queue-z. Obviously, this virtual
queue is a great waste of queue resources. In fact, queue-z can be
reused by the subsequent outgoing timeslot (z+M)%N. Namely:
During the ongoing sending timeslot j = (z+1)%N, the bursts that
arrive at the outgoing port, that is, these bursts may reserve the
outgoing timeslot (z+M)%N according to o = M-1.
During the ongoing sending timeslot j = (z+2)%N, the bursts that
arrive at the outgoing port, that is, these bursts may reserve the
outgoing timeslot (z+M)%N according to o = M-2.
... ...
During the ongoing sending timeslot j = (z+M-1)%N, the bursts that
arrive at the outgoing port, that is, these bursts may reserve the
outgoing timeslot (z+M)%N according to o = 1.
The total reserved amount of all these bursts does not exceed the
MRB of the outgoing timeslot (z+M)%N.
It can be seen that queue-z can be used by any outgoing timeslot
(z+k*M)%N, where k is a non-negative integer. By observing
(z+k*M)%N, it can be seen that the minimum z satisfies 0 <= z < M,
that is, the entire orchestration period actually only requires M
queues to store packets, which are the queues corresponding to the M
timeslots in the scheduling period. That is to say, the minimum z is
the timeslot id in the scheduling period, while the outgoing timeslot
(z+k*M)%N is the timeslot id in the orchestration period. The former
is obtained from the latter by taking it modulo M, which gives access
to the queue corresponding to the former. In short, the reason why a
queue can store packets from multiple outgoing timeslots without
overflowing is that the packets stored in the queue earlier (more than
M timeslots ago) have already been sent.
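The argument above can also be checked by brute force. The sketch
below is an illustration with a hypothetical function name: for every
ongoing sending timeslot j, it collects the timeslot being sent (o = 0)
together with every timeslot that can still receive packets (1 <= o <
M), and verifies that these M timeslots map to M distinct queues under
the modulo-M conversion.

```python
def queues_never_shared(n, m):
    """Check that no queue holds two live outgoing timeslots at once.

    At ongoing sending timeslot j, the live outgoing timeslots are
    (j + o) % N for 0 <= o < M. The check passes if their images
    under `% M` are pairwise distinct for every j.
    """
    for j in range(n):
        live = [(j + o) % n for o in range(m)]
        if len({z % m for z in live}) != m:
            return False
    return True


ok_multiple = queues_never_shared(12, 4)      # N is a multiple of M
ok_non_multiple = queues_never_shared(10, 4)  # N is not a multiple of M
```

The second call fails because of the wrap-around at the end of the
orchestration period (for example, timeslots 8 and 0 would share queue
0), which illustrates why N is required to be an integer multiple of M.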
4. Global Timeslot ID
The outgoing timeslots discussed in the previous sections follow the
local timeslot style on all nodes. This section discusses the
situation based on the global timeslot style.
Global timeslot style means that all nodes in the path identify the
reserved timeslot with the same timeslot id, which of course requires
all nodes to use the same timeslot length. The advantage is that the
resource reservation based on global timeslots is simple, always
reserving a specified outgoing timeslot for the service flow. There
is no need to establish a local timeslot mapping relationship on each
node or carry this mapping relationship in packets. The packet only
needs to carry the unique global timeslot id. However, the
disadvantage is that the latency performance of the path may be
large, which depends on the phase difference between the inherent
orchestration periods between the adjacent nodes. Another
disadvantage is that the success rate of finding a path that matches
the service requirements is not as high as local timeslot style.
Global timeslot style requires that the orchestration period is equal
to the scheduling period, mainly so that arriving packets with any
global timeslot id can be successfully inserted into the
corresponding queue. However, since a scheduling period shorter than
the orchestration period is the ideal design goal, further research
is needed on other methods (such as basically aligning the
orchestration periods between nodes) to ensure that packets with any
global timeslot id can be queued normally when the scheduling period
is less than the orchestration period.
Compared to the local timeslot style, global timeslot style means
that the incoming timeslot i must map to the outgoing timeslot i too.
As the example shown in Figure 7, each orchestration period contains
6 timeslots. Node V has three connected upstream nodes U1, U2, and
U3. During each hop forwarding, the packet accesses the outgoing
timeslot corresponding to the global timeslot id and forwards to the
downstream node with the global timeslot id unchanged. For example,
U1 sends some packets with global slot-id 0, termed as g0, in the
outgoing timeslot 0. The packets with other global slot-id 1~5 are
similarly termed as g1~g5 respectively. The figure shows the
scheduling results of these 6 batches of packets sent by upstream
nodes when node V continues to send them.
     0   1   2   3   4   5   0   1   2   3   4   5   0   1   2
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
U1 | g0| g1| g2|   |   |   |   |   |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
     1   2   3   4   5   0   1   2   3   4   5   0   1   2   3
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
U2 |   |   | g3| g4|   |   |   |   |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
     5   0   1   2   3   4   5   0   1   2   3   4   5   0   1
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
U3 | g5|   |   |   |   |   |   |   |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
     0   1   2   3   4   5   0   1   2   3   4   5   0   1   2
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
V  |   |   |   | g3| g4| g5| g0| g1| g2|   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 7
In this example, the mapping relationship of the outgoing timeslot
from U1 and the ongoing sending timeslot of V is i -> i, so the
reserved outgoing timeslot for the incoming timeslot i is i+6. The
mapping relationship of the outgoing timeslot from U2 and the ongoing
sending timeslot of V is i -> i-1, so the reserved outgoing timeslot
for the incoming timeslot i is i. And, the mapping relationship of
the outgoing timeslot from U3 and the ongoing sending timeslot of V
is i -> i+1, so the reserved outgoing timeslot for the incoming
timeslot i is i+6-1.
For the headend, the residence delay depends on the arrival time when
the sub-burst arrives at the scheduler and specified global timeslot.
Suppose that the ongoing sending timeslot is j at the arrival time
when the sub-burst arrives at the scheduler, and there is time T_j
left before the end of the timeslot j, and the sub-burst is specified
to use global timeslot i, then, the reserved outgoing timeslot is
(j+o)%N, where o equals (N+i-j)%N. The residence delay equation for
headend is similar to Section 3.1.4.1.
For any other nodes, suppose that the incoming timeslot i mapped to
the ongoing sending timeslot j, and there is time T_ij left before
the end of the timeslot j. Then, the reserved outgoing timeslot is
(j+o)%N, where o equals (N+i-j)%N. The residence delay equation for
intermediate node is similar to Section 3.1.4.2.
For example, the packets g3 sent by upstream node U2 fall into the
ongoing sending timeslot 2 of node V, so they can be sent in outgoing
global timeslot 3. In this case, the residence delay in node V
is small. In contrast, the packets g5 sent by upstream node U3 fall
into the ongoing sending timeslot 0 of node V, so they need to wait
through timeslots 0, 1, 2, 3, and 4 before being sent in global
outgoing timeslot 5. In this case, the residence delay in node V is
large.
As another example, the packets g0 sent by upstream node U1 fall into
the ongoing sending timeslot 0 of node V, so the packets need to wait
for the end of the ongoing orchestration period to be sent in the
global outgoing timeslot 0 of the next orchestration period, which
introduces a large node residence delay. It should be noted that
in this case, the packets g0, when they fall into the ongoing sending
timeslot 0, cannot be placed in the buffer corresponding to timeslot
0. Instead, they need to be stored in a buffer prior to the TQF
scheduler (such as the buffer on the input port side) for a fixed
latency (such as a fixed timeslot) and then released to the timeslot
scheduler. This fixed-latency buffer is only created for specific
upstream nodes. Whether it is needed can be determined from the
initial detection result of the mapping relationship between the
outgoing timeslot of the upstream node and the ongoing sending
timeslot of this node. If the initial detection result is slot-id i
-> slot-id i, the buffer needs to be introduced; otherwise it is
unnecessary. After the introduction of the fixed-latency buffer, the
new detection result will no longer be i -> i.
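The offset rule o = (N+i-j)%N and the three cases of Figure 7 can be
sketched as follows. The function names are hypothetical; the check
for o = 0 simply flags the i -> i case, in which the text above calls
for a fixed-latency buffer in front of the TQF scheduler.

```python
def global_offset(i, j, n):
    """Offset o from ongoing sending timeslot j to global timeslot i."""
    return (n + i - j) % n


def needs_fixed_latency_buffer(i, j, n):
    """True when the detected mapping is i -> i, i.e. o would be 0."""
    return global_offset(i, j, n) == 0


N = 6
o_g3 = global_offset(i=3, j=2, n=N)   # g3 from U2 arrives in timeslot 2
o_g5 = global_offset(i=5, j=0, n=N)   # g5 from U3 arrives in timeslot 0
o_g0 = global_offset(i=0, j=0, n=N)   # g0 from U1 arrives in timeslot 0
buffer_for_u1 = needs_fixed_latency_buffer(0, 0, N)
```

For g3 the offset is 1 (small residence delay), for g5 it is 5 (large
residence delay), and for g0 it is 0, which is the case that requires
the fixed-latency buffer.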
The end-to-end delay equation is similar to that in
Section 3.1.4.4.
5. Summary of Timeslot Style
Depending on the strategy of reserving timeslot resources, different
timeslot styles will be presented, as shown in the table below.
+===============+========================+==================+
| Strategy      | Timeslot Style         | Reference        |
+===============+========================+==================+
| Flexible o    | Local timeslot style   | Section 3.1.4    |
| (1<=o<M)      |                        |                  |
+---------------+------------------------+------------------+
| Constant o    | Global timeslot style  | Section 4        |
| (o=(N+i-j)%N) |                        |                  |
+---------------+------------------------+------------------+
| Constant o    | Multi-CQF              | [Multi-CQF]      |
| (o=1)         |                        |                  |
+---------------+------------------------+------------------+
Figure 8
6. In-time Scheduling
So far, the TQF mechanism presented above, for both the local
timeslot style and the global timeslot style, reserves a fixed
outgoing timeslot for the sub-burst in the orchestration period and
sends the sub-burst exactly in that timeslot. This is on-time
scheduling.
In this section, we discuss another scheduling variant of TQF, i.e.,
in-time scheduling. In this case, timeslot resources are still
reserved based on delay requirement, but in actual forwarding,
packets do not necessarily have to wait until the reserved outgoing
timeslot for sending.
An in-time TQF scheduler based on PIFO (push-in first-out) will be
described in later versions.
7. Queue Design
7.1. Queue Design of On-time Scheduler
The number of timeslot queues should be designed according to the
number of timeslots included in the scheduling period. Each timeslot
corresponds to a separate queue (or queue group), in which the
buffered packets must be able to be sent within a timeslot.
The length of the queue, i.e., the total number of bits that can be
reserved or sent for a timeslot, does not have to be set to be
exactly equal to the link rate multiplied by the timeslot length.
This is because the bandwidth requirements of other non-deterministic
services and protocols running in the network should also be
considered.
7.1.1. Full Queues
When the scheduling period length is equal to the orchestration
period length, the node will implement full queues. The advantage is
that the actual forwarding resources present the same view as the
resources used for reservation, so that the resource reservation
process is simple. However, the disadvantage is that, because the
scheduling period is generally large in order to cover all service
requirements, the number of queues maintained by the node will be
large.
For example, suppose the accumulated length of all queues supported
by the hardware is 4G bytes. At a port rate of 100G bps, the queue
length corresponding to a timeslot of 10 us is 1M bits, so a maximum
of 32K timeslot queues can be provided, and the maximum length of
the orchestration period supported is 320 ms. However, considering
the queue resource requirements of other non-deterministic services,
the TQF function can only use some of the queue resources, such as
10K~20K queues. In this case, the length of the orchestration period
supported by the node is 100~200 ms.
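The arithmetic of this example can be reproduced with the following
minimal sketch. The function and parameter names are illustrative
assumptions and are not defined by TQF; the sketch also assumes that
the G, M and K prefixes are decimal, which is what makes the figures
above come out exactly.

```python
def max_timeslot_queues(total_buffer_bytes, port_rate_bps, timeslot_us):
    """Return (bits per timeslot queue, number of queues the buffer holds)."""
    # rate [bit/s] * timeslot [us] / 1e6 = bits that fit in one timeslot.
    queue_bits = port_rate_bps * timeslot_us // 1_000_000
    queues = (total_buffer_bytes * 8) // queue_bits
    return queue_bits, queues


def max_orchestration_period_ms(queues, timeslot_us):
    """Longest orchestration period (ms) covered by one queue per timeslot."""
    return queues * timeslot_us / 1000


queue_bits, queues = max_timeslot_queues(
    total_buffer_bytes=4 * 10**9,  # 4G bytes (decimal prefix assumed)
    port_rate_bps=100 * 10**9,     # 100G bps
    timeslot_us=10,
)
print(queue_bits, queues, max_orchestration_period_ms(queues, 10))

# TQF limited to part of the queue resources (10K~20K queues).
print(max_orchestration_period_ms(10_000, 10),
      max_orchestration_period_ms(20_000, 10))
```

The sketch treats the whole buffer as available to timeslot queues in
the first call; the second print shows the reduced range when only
10K~20K queues are given to TQF.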
7.1.2. Non-full Queues
When the length of the scheduling period is less than the length of
the orchestration period, the node will implement non-full queues.
The advantages and disadvantages are the opposite of those of the
full queues option: the actual forwarding resources are inconsistent
with the view of the resources used for reservation, but the number
of queues maintained by the node is small.
7.2. Queue Design of In-time Scheduler
A single PIFO may be used for in-time scheduling. More details will
be provided in later versions.
8. Multiple Orchestration Periods
A single orchestration period may not be able to cover a wide range
of service needs, for example when some services have a burst
interval of microseconds while others have a burst interval of
minutes or even larger. When a single orchestration period serves
these services simultaneously, the timeslot length must be on the
order of microseconds while the orchestration period length is
minutes or more, so the orchestration period must include a large
number of timeslots. The final result is a proportional increase in
the number of queues required for the scheduling period (to avoid
potential timeslot conflicts).
Multiple orchestration periods, each with a different length, may be
provided by the network. A TQF enabled link can be configured with
multiple TQF scheduling instances, each corresponding to a specific
orchestration period length. For simplicity, the orchestration
period length itself can be used to identify a specific instance.
For example, one orchestration period length is 300 us, termed
LOP-300us, which is the LCM of the burst intervals of the set of
flows served. Another orchestration period length is 100 ms, termed
LOP-100ms, which is the LCM of the burst intervals of another set of
flows served. Each orchestration period instance has its own
timeslot length. The timeslot length of a long orchestration period
instance should be longer than that of a short orchestration period
instance, and the former should be an integer multiple of the
latter. However, the long orchestration period itself is not
necessarily an integer multiple of the short orchestration period.
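The relationships in this example can be checked with a short sketch.
The burst intervals (100 us and 150 us; 20 ms and 25 ms) and the
timeslot lengths (10 us and 100 us) are hypothetical values chosen
only so that the LCMs match LOP-300us and LOP-100ms; the function
names are likewise assumptions made for illustration.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def orchestration_period_us(burst_intervals_us):
    """LCM of the burst intervals (in microseconds) of the flows served."""
    return reduce(lcm, burst_intervals_us)


def timeslot_lengths_compatible(short_lt_us, long_lt_us):
    """Long instance's timeslot is longer and a multiple of the short one."""
    return long_lt_us > short_lt_us and long_lt_us % short_lt_us == 0


# Hypothetical burst intervals for the two sets of flows.
lop_short = orchestration_period_us([100, 150])
lop_long = orchestration_period_us([20_000, 25_000])
print(lop_short, lop_long)

# Hypothetical timeslot lengths of the short and long instances.
print(timeslot_lengths_compatible(10, 100))

# The long period need not be an integer multiple of the short period.
print(lop_long % lop_short == 0)
```

With these values the last check is false, since 100 ms is not an
integer multiple of 300 us, while the timeslot lengths still satisfy
the integer-multiple relationship.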
As shown in Figure 9, both link-a and link-b are configured with n
orchestration period instances, with the corresponding orchestration
period lengths LOP_1, LOP_2, ..., LOP_n in descending order. For
each orchestration period length LOP_i, the bandwidth resource
allocated is BW_U_i for node U (or BW_V_i for node V), and the
timeslot length is LT_U_i for node U (or LT_V_i for node V). For
each TQF enabled link, the sum of bandwidth resources allocated to
all orchestration period instances must not exceed the total
bandwidth of the link.
+---+        link-a        +---+        link-b        +---+
| U | -------------------- | V | -------------------- | W |
+---+                      +---+                      +---+
     LOP_1:                     LOP_1:
       LT_U_1                     LT_V_1
       BW_U_1                     BW_V_1
     LOP_2:                     LOP_2:
       LT_U_2                     LT_V_2
       BW_U_2                     BW_V_2
     ... ...                    ... ...
     LOP_n:                     LOP_n:
       LT_U_n                     LT_V_n
       BW_U_n                     BW_V_n
Figure 9
Long orchestration periods serve service flows with large burst
intervals, and for a given burst size, the larger the burst interval,
the less bandwidth the service flow consumes.
Therefore, it is recommended that the bandwidth resources allocated
to long orchestration period instances be less than those allocated
to short orchestration period instances, which also helps to reduce
the queue length required for long orchestration period instances in
on-time mode.
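The per-link bandwidth constraint and the recommendation above can be
expressed as a simple check. This is only a sketch: the function
name, the tuple layout and the bandwidth figures are assumptions made
for illustration, not values from this document.

```python
def check_instance_bandwidth(link_bw_bps, instances):
    """Check the instances configured on one TQF enabled link.

    instances: list of (lop_us, bw_bps) tuples, one per orchestration
    period instance.
    Returns (fits, follows_recommendation).
    """
    # The sum over all instances must not exceed the link bandwidth.
    fits = sum(bw for _, bw in instances) <= link_bw_bps

    # Longest orchestration period first (LOP_1, LOP_2, ..., LOP_n).
    by_lop_desc = sorted(instances, key=lambda inst: inst[0], reverse=True)
    bws = [bw for _, bw in by_lop_desc]

    # Recommendation: a longer period is allocated less bandwidth.
    follows = all(a < b for a, b in zip(bws, bws[1:]))
    return fits, follows


# Hypothetical allocation on a 100G bps link.
print(check_instance_bandwidth(
    100 * 10**9,
    [(300, 40 * 10**9), (100_000, 10 * 10**9)],
))
```

The first element of the result reflects the mandatory constraint,
while the second reflects only the recommendation, so a configuration
may legitimately return a false second element.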
Interworking between different nodes is based on the same
orchestration period instance. That means that the timeslot mapping
described in Section 3.1.1 should be maintained in the context of the
specific orchestration period instance, and the timeslot resource
reservation along the path for a sub-task should also be in the
context of the specific orchestration period instance. The
orchestration period length should be carried in the forwarded
packets so that the service flow can access the timeslot resources
corresponding to the orchestration period instance.
For on-time mode, each orchestration period instance has its own
separate queue set. Time division multiplexing scheduling is based
on the granularity of the minimum timeslot length of all instances.
Within each time unit of this granularity, the queues in the sending
state of all instances are always scheduled in the order of LOP_1,
LOP_2, ..., LOP_n.
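One possible reading of this service order is sketched below. It
assumes full queues (one queue per timeslot of the orchestration
period) and assumes that time 0 is the start of the orchestration
period of every instance on the node; neither assumption, nor any of
the names used, is specified by this document.

```python
def sending_queues(t_us, instances):
    """Queues in the sending state at time t_us, in service order.

    instances: list of (lop_us, lt_us) tuples, one per orchestration
    period instance (period length and timeslot length, microseconds).
    Returns a list of (lop_us, queue_index) tuples.
    """
    # Scheduling granularity: the minimum timeslot length of all instances.
    unit_us = min(lt_us for _, lt_us in instances)
    unit_start_us = (t_us // unit_us) * unit_us

    # LOP_1 (longest period) is served first, LOP_n (shortest) last.
    order = sorted(instances, key=lambda inst: inst[0], reverse=True)

    result = []
    for lop_us, lt_us in order:
        slots = lop_us // lt_us  # full queues: one queue per timeslot
        result.append((lop_us, (unit_start_us // lt_us) % slots))
    return result


# Hypothetical instances: LOP-300us with 10 us timeslots and
# LOP-100ms with 100 us timeslots.
print(sending_queues(250, [(300, 10), (100_000, 100)]))
print(sending_queues(310, [(300, 10), (100_000, 100)]))
```

With non-full queues the queue index would instead be taken modulo
the number of timeslots in the scheduling period.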
For in-time mode, all orchestration period instances may share a
single PIFO.
9. Admission Control on the Headend
At the network entry, traffic regulation must be performed on the
incoming port, so that the service flow does not exceed its T-SPEC
(burst interval, burst size, maximum packet size, etc.). This kind
of regulation is usually shaping that uses a leaky bucket combined
with the incoming queue that receives the service flow. A service
flow may contain multiple discrete sub-bursts within its periodic
burst interval. The leaky bucket depth should be larger than the
maximum packet size, and should be consistent with the reserved burst
resources required for the maximum sub-burst.
The scheduling mechanism described in this document places a
requirement on the arrival time of service flows at the network
entry. Ideally, the sub-bursts of the service flow (after
regulation) always appear at fixed positions within the
orchestration period. Based on this ideal position, any packet of
the service flow is matched to the sub-burst forwarding state whose
outgoing timeslot follows most closely behind the position, and then
obtains the outgoing timeslot for sending. Note that the network
entry may maintain multiple sub-burst forwarding states for a single
service flow, because the flow may contain many sub-bursts within
its burst interval.
For example, the network entry may maintain up to 3 sub-burst
forwarding states for a flow. Ideally, all packets of this flow are
split into 3 sub-bursts after regulation, each sub-burst matching one
of the states. Here, 3 is the maximum number of sub-bursts for this
flow; the flow does not always contain that many sub-bursts within
the burst interval during actual sending.
For a specific sub-burst, some amount of deviation (i.e., the
deviation between the actual arrival position and the ideal arrival
position within the orchestration period) is permitted for its
position if it is to lock the outgoing timeslot obtained from the
matched forwarding state.
For on-time scheduling, the position deviation should not exceed o-1
for the late arrival case, or M-o-1 for the early arrival case, where
o is the offset between the reserved outgoing timeslot and the
ongoing sending timeslot as mentioned above. Intuitively, a large o
tolerates large late arrival deviations, while a small o (or a large
M, even for a large o) tolerates large early arrival deviations.
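The on-time limits can be written as a small predicate. This is a
sketch under stated assumptions: the deviation is taken to be
measured in timeslots, a positive value denotes late arrival and a
negative value early arrival, and M is taken to be the number of
timeslots in the scheduling period. The function name is
hypothetical.

```python
def on_time_deviation_ok(deviation, o, M):
    """Return True if a sub-burst can still lock its reserved timeslot.

    deviation: arrival position minus ideal position, in timeslots
               (positive = late arrival, negative = early arrival).
    o: offset between the reserved outgoing timeslot and the ongoing
       sending timeslot.
    M: number of timeslots in the scheduling period (assumed meaning).
    """
    if deviation >= 0:
        return deviation <= o - 1    # late arrival limit
    return -deviation <= M - o - 1   # early arrival limit


# Hypothetical values: M = 10 and o = 4 allow up to 3 timeslots of
# late arrival and up to 5 timeslots of early arrival.
for d in (3, 4, -5, -6):
    print(d, on_time_deviation_ok(d, o=4, M=10))
```

Raising o in this sketch widens the late-arrival tolerance and
narrows the early-arrival tolerance, while raising M widens only the
early-arrival tolerance.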
This position deviation limitation is beneficial for on-time
scheduling: it allows the ideal design goal to be achieved, in which
the scheduling period is smaller than the orchestration period and
packets can always be inserted into the scheduling queue without
conflicts. Otherwise, randomly arriving service flows can be
supported either by taking a large M (or even M = N) to accommodate
random arrival (option-1), or by introducing an explicit buffer
before the scheduler at the network entry so that the arrival time
always meets the fixed position (option-2).
* Note that due to the randomness of the arrival time, the packet
may just miss the scheduling (or arrive too early) and need to wait
in the scheduling queue (in the case of option-1) or the explicit
buffer (in the case of option-2) for the next orchestration period.
From this perspective, we suggest that it is best for service flows
to strictly obey their arrival time, which should be the ideal
admission control for all scheduling mechanisms that attempt to
forward service flows in a specified time window.
For in-time scheduling, the position deviation should not exceed o-1
for the late arrival case. We only focus on late arrivals here, as
in-time scheduling naturally handles early arrivals. If the late
arrival exceeds the above limitation, the sub-burst may need to be
sent during the next orchestration period in the worst case, or it
may happen to be scheduled immediately.
10. Frequency Synchronization
Frequency synchronization basically means that the crystal frequency
of the hardware is consistent across nodes, so that all nodes in the
network are in the same inertial frame and have the same time lapse
rate. This is a prerequisite for all latency-based scheduling
mechanisms. The frequency synchronization mechanism itself (such as
SyncE) is outside the scope of this document.
The term frequency asynchrony is sometimes also used for the
difference in timeslot rotation frequency caused by nodes being
configured with different timeslot lengths. This document supports
the interconnection between nodes with this type of frequency
asynchrony.
11. Evaluations
This section gives the evaluation results of the TQF mechanism based
on the requirements defined in
[I-D.ietf-detnet-scaling-requirements].
+======================+============+===============================+
| Requirement          | Evaluation | Notes                         |
+======================+============+===============================+
| 3.1 Tolerate Time    | Partial    | No time sync needed, but      |
|     Asynchrony       |            | frequency sync is required.   |
+----------------------+------------+-------------------------------+
| 3.2 Support Large    | Yes        | The detection of timeslot     |
|     Single-hop       |            | mapping covers link           |
|     Propagation      |            | propagation delay.            |
|     Latency          |            |                               |
+----------------------+------------+-------------------------------+
| 3.3 Accommodate the  | Partial    | The higher the service rate,  |
|     Higher Link      |            | the more buffer is needed for |
|     Speed            |            | the same timeslot length.     |
+----------------------+------------+-------------------------------+
| 3.4(1) Be Scalable   | Partial    | Calculating paths for as many |
|     to a Large       |            | flows as possible is an       |
|     Number of Flows  |            | NP-hard problem.              |
+----------------------+------------+-------------------------------+
| 3.4(2) Tolerate High | Yes        | The unused bandwidth of the   |
|     Utilization      |            | timeslot can be used by       |
|                      |            | best-effort flows.            |
+----------------------+------------+-------------------------------+
| 3.5 Prevent Flow     | Yes        | Flows are permitted based on  |
|     Fluctuation from |            | timeslot reservation, isolated|
|     Disrupting       |            | from each other through       |
|     Service          |            | timeslots.                    |
+----------------------+------------+-------------------------------+
| 3.6 Tolerate Failures| N/A        | Independent of queueing       |
|     of Links or Nodes|            | mechanism.                    |
|     and Topology     |            |                               |
|     Changes          |            |                               |
+----------------------+------------+-------------------------------+
| 3.7 Be scalable to a | Partial    | Calculating TQF paths for all |
|     Large Number of  |            | services is an NP-hard        |
|     Hops with Complex|            | problem, related to hop count.|
|     Topology         |            |                               |
+----------------------+------------+-------------------------------+
| 3.8 Support Multi-   | N/A        | Independent of queueing       |
|     Mechanisms in    |            | mechanism.                    |
|     Single Domain and|            |                               |
|     Multi-Domains    |            |                               |
+----------------------+------------+-------------------------------+
Figure 10
12. IANA Considerations
TBD.
13. Security Considerations
TBD.
14. Acknowledgements
TBD.
15. References
15.1. Normative References
[I-D.chen-detnet-sr-based-bounded-latency]
Chen, M., Geng, X., and Z. Li, "Segment Routing (SR) Based
Bounded Latency", Work in Progress, Internet-Draft, draft-
chen-detnet-sr-based-bounded-latency-02, 13 March 2023,
<https://datatracker.ietf.org/doc/html/draft-chen-detnet-
sr-based-bounded-latency-02>.
[I-D.eckert-detnet-tcqf]
Eckert, T. T., Li, Y., Bryant, S., Malis, A. G., Ryoo, J.,
Liu, P., Li, G., Ren, S., and F. Yang, "Deterministic
Networking (DetNet) Data Plane - Tagged Cyclic Queuing and
Forwarding (TCQF) for bounded latency with low jitter in
large scale DetNets", Work in Progress, Internet-Draft,
draft-eckert-detnet-tcqf-03, 19 June 2023,
<https://datatracker.ietf.org/doc/html/draft-eckert-
detnet-tcqf-03>.
[I-D.ietf-detnet-scaling-requirements]
Liu, P., Li, Y., Eckert, T. T., Xiong, Q., Ryoo, J.,
zhushiyin, and X. Geng, "Requirements for Scaling
Deterministic Networks", Work in Progress, Internet-Draft,
draft-ietf-detnet-scaling-requirements-02, 24 May 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-detnet-
scaling-requirements-02>.
[I-D.peng-lsr-deterministic-traffic-engineering]
Peng, S., "IGP Extensions for Deterministic Traffic
Engineering", Work in Progress, Internet-Draft, draft-
peng-lsr-deterministic-traffic-engineering-00, 22 May
2023, <https://datatracker.ietf.org/doc/html/draft-peng-
lsr-deterministic-traffic-engineering-00>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
15.2. Informative References
[ATM-LATENCY]
"Bounded Latency Scheduling Scheme for ATM Cells", 1999,
<https://ieeexplore.ieee.org/document/780828/>.
[CQF] "Cyclic queueing and Forwarding", 2017,
<https://ieeexplore.ieee.org/document/7961303>.
[Multi-CQF]
"Multiple Cyclic queueing and Forwarding", 2021,
<https://www.ieee802.org/1/files/public/docs2021/new-finn-
multiple-CQF-0921-v02.pdf>.
[TAS] "Time-Aware Shaper", 2015,
<https://standards.ieee.org/ieee/802.1Qbv/6068/>.
Authors' Addresses
Shaofu Peng
ZTE
China
Email: peng.shaofu@zte.com.cn
Peng Liu
China Mobile
China
Email: liupengyjy@chinamobile.com
Kashinath Basu
Oxford Brookes University
United Kingdom
Email: kbasu@brookes.ac.uk
Aihua Liu
ZTE
China
Email: liu.aihua@zte.com.cn
Dong Yang
Beijing Jiaotong University
China
Email: dyang@bjtu.edu.cn
Guoyu Peng
Beijing University of Posts and Telecommunications
China
Email: guoyupeng@bupt.edu.cn