IPPM H. Song, Ed.
Internet-Draft Futurewei
Intended status: Informational T. Zhou
Expires: October 15, 2020 Z. Li
Huawei
J. Shin
SK Telecom
K. Lee
LG U+
April 13, 2020
Postcard-based On-Path Flow Data Telemetry
draft-song-ippm-postcard-based-telemetry-07
Abstract
The document describes a variation of the Postcard-Based Telemetry
(PBT), the marking-based PBT. Unlike the instruction-based PBT, as
embodied in [I-D.ietf-ippm-ioam-direct-export], the marking-based PBT
does not require the encapsulation of a telemetry instruction header
so it avoids some of the implementation challenges of the
instruction-based PBT. This document discusses the issues and
solutions of the marking-based PBT.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 15, 2020.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
Song, et al. Expires October 15, 2020 [Page 1]
Internet-Draft Postcard-Based Telemetry April 2020
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Motivation
2. PBT-M: Marking-based PBT
3. New Challenges
4. Considerations on PBT-M Design
  4.1. Packet Marking
  4.2. Flow Path Discovery
  4.3. Packet Identity for Export Data Correlation
  4.4. Avoid Packet Marking through Node Configuration
5. Postcard Format
6. Security Considerations
7. IANA Considerations
8. Contributors
9. Acknowledgments
10. Informative References
Authors' Addresses
1. Motivation
In order to gain detailed data plane visibility to support effective
network OAM, it is important to be able to examine the trace of user
packets along their forwarding paths. Such on-path flow data reflect
the state and status of each user packet's real-time experience and
provide valuable information for network monitoring, measurement, and
diagnosis.
The telemetry data include, but are not limited to, the detailed
forwarding path, the timestamp/latency at each network node, and, in
case of packet drop, the drop location and reason. Emerging
programmable data plane devices allow user-defined data collection or
conditional data collection based on trigger events. Such on-path
flow data come from and describe the live user traffic, and they
complement the data acquired through other passive and active OAM
mechanisms such as IPFIX [RFC7011] and ICMP [RFC2925].
On-path telemetry was developed to cater to the need for collecting
on-path flow data. There are two basic modes for on-path telemetry: the
passport mode and the postcard mode. In the passport mode, each node
on the path adds the telemetry data to the user packets (i.e., stamps
the passport). The accumulated data trace carried by user packets
is exported at a configured end node. In the postcard mode, each
node directly exports the telemetry data using an independent packet
(i.e., sends a postcard) to avoid the need to carry the data with
user packets.
In-situ OAM trace option (IOAM) [I-D.ietf-ippm-ioam-data] is a
representative of the passport mode on-path telemetry. A prominent
advantage of the passport mode is that it naturally retains the
telemetry data correlation along the entire path. The passport mode
also reduces the number of data export packets. These help to
simplify the data collector and analyzer's work. On the other hand,
the passport mode faces the following challenges.
o Issue 1: Since the telemetry instruction header and data
processing must be done in the data-plane fast-path, it may
interfere with the normal traffic forwarding (e.g., leading to
forwarding performance degradation) and lead to inaccurate
measurements (e.g., resulting in longer latency measurements than
usual). This undesirable "observer effect" is problematic for
carrier networks where stringent SLAs must be observed.
o Issue 2: The passport mode may significantly increase the user
packet's original size by adding data at each on-path node. The
size may exceed the path MTU, so either the technique cannot be
applied or the packet needs to be fragmented. This is especially
troubling when some other network service headers (e.g., segment
routing or service function chaining) are also present. Limiting
the data size or path length reduces the effectiveness of the
telemetry.
o Issue 3: The instruction header needs to be encapsulated into user
packets for transport. [I-D.brockners-inband-oam-transport] has
discussed several encapsulation approaches for different transport
protocols. However, there is no feasible solution so far to
encapsulate the instruction header in MPLS and IPv4 networks, which
are still the most widely deployed. It is also challenging to
encapsulate the instruction header in IPv6
[I-D.song-ippm-ioam-ipv6-support].
o Issue 4: Transported in plain text along the network paths, the
instruction header and data are vulnerable to eavesdropping and
tampering as well as DoS attacks. Extra protective measures are
difficult to apply on the data-plane fast-path.
o Issue 5: Since the passport mode only exports the telemetry data
at the designated end node, if the packet is dropped in the
network, the data will be lost as well. The passport mode cannot
pinpoint the packet drop location, which is needed for fault
diagnosis. Even worse, the end node may be entirely unaware of the
packet and data loss.
The postcard mode provides a perfect complement to the passport mode.
In postcard-based telemetry (PBT), the postcards that carry telemetry
data can be generated by a node's slow path and transported in band
or out of band, independent of the original user packets. IOAM
direct export option (DEX) [I-D.ietf-ippm-ioam-direct-export] is a
representative of PBT. Because an instruction header is still needed,
this type of instruction-based PBT addresses Issues 2 and 5 and
partially addresses Issues 1 and 4, but it still cannot address
Issue 3.
This document describes another variation of the postcard mode on-
path telemetry, the marking-based PBT (PBT-M). Unlike the
instruction-based PBT, the marking-based PBT does not require the
encapsulation of a telemetry instruction header so it avoids some of
the implementation challenges of the instruction-based PBT. This
document discusses the issues and solutions of the marking-based PBT.
2. PBT-M: Marking-based PBT
As the name suggests, PBT-M only needs a marking-bit in the existing
headers of user packets to trigger the telemetry data collection and
export. The sketch of PBT-M is as follows. The user packet, if its
on-path data need to be collected, is marked at the path head node.
At each PBT-aware node, if the mark is detected, a postcard (i.e.,
the dedicated OAM packet triggered by a marked user packet) is
generated and sent to a collector. The postcard contains the data
requested by the management plane. The requested data are configured
by the management plane through data set templates (as in IPFIX
[RFC7011]). Once the collector receives all the postcards for a
single user packet, it can infer the packet's forwarding path and
analyze the data set. The path end node is configured to unmark the
packets, restoring their original format, if necessary.
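The per-node behavior sketched above can be illustrated as follows.
This is a minimal sketch under stated assumptions, not a normative
procedure: the names Packet, Postcard, and process_at_node are
hypothetical, the mark is modeled as a boolean rather than as a
specific header bit, and the only data item offered for export is the
TTL.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Packet:
    flow_id: str
    packet_id: int
    ttl: int
    marked: bool = False  # stands in for the marking bit in an existing header


@dataclass
class Postcard:
    node_id: str
    flow_id: str
    packet_id: int
    data: Dict[str, int] = field(default_factory=dict)


def process_at_node(node_id: str, pkt: Packet, template: List[str],
                    collector: List[Postcard],
                    is_end_node: bool = False) -> Packet:
    """Hypothetical handling of one user packet at one PBT-aware node."""
    if pkt.marked:
        # Data this node could report; only the items named in the
        # configured data set template are exported.
        available = {"ttl": pkt.ttl}
        data = {name: available[name] for name in template if name in available}
        collector.append(Postcard(node_id, pkt.flow_id, pkt.packet_id, data))
        if is_end_node:
            pkt.marked = False  # unmark before the packet leaves the domain
    pkt.ttl -= 1  # normal forwarding proceeds whether or not the packet is marked
    return pkt


# The head node marks the packet; every node on the path then emits a postcard.
collector: List[Postcard] = []
pkt = Packet(flow_id="flow-1", packet_id=7, ttl=64, marked=True)
path = [("head", False), ("A", False), ("B", False), ("end", True)]
for node_id, is_end in path:
    pkt = process_at_node(node_id, pkt, ["ttl"], collector, is_end)
```

In this sketch the collector list stands in for the export channel;
a real node would send each postcard as a separate OAM packet.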
The overall architecture of PBT-M is depicted in Figure 1.
                     +------------+        +-----------+
                     | Network    |        | Telemetry |
                     | Management |<-------| Data      |
                     |            |        | Collector |
                     +-----:------+        +-----------+
                           :                     ^
                           :configurations       |postcards (OAM pkts)
                           :                     |
            ...............:.....................|........
            :             :               :      |       :
            :   +---------:---+-----------:---+--+-------:---+
            :   |         :   |           :   |          :   |
            V   |         V   |           V   |          V   |
         +------+-+     +-----+--+     +------+-+     +------+-+
usr pkts | Head   |     | Path   |     | Path   |     | End    |
    ====>| Node   |====>| Node   |====>| Node   |====>| Node   |====>
         |        |     | A      |     | B      |     |        |
         +--------+     +--------+     +--------+     +--------+
        gen postcards  gen postcards  gen postcards  gen postcards
        mark usr pkts                                unmark usr pkts
Figure 1: Architecture of PBT-M
PBT-M aims to fully address the issues listed above. It also
introduces some new benefits. The advantages of PBT-M are as
follows.
o 1: PBT-M avoids augmenting user packets with new headers and
introducing new data plane protocols. The signaling that triggers
telemetry data collection remains in the data plane.
o 2: PBT-M is extensible for collecting arbitrary new data to
support possible future use cases. The data set to be collected
can be configured through the management plane or control plane.
Since there is no limitation on the types of data, data other than
those defined in [I-D.ietf-ippm-ioam-data] can also be collected.
Since there is no longer a size constraint, the more flexible data
set template can be used for data type definition.
o 3: PBT-M avoids interfering with normal forwarding and affecting
the forwarding performance. The collected data can be transported
independently through in-band or out-of-band channels. The data
collecting, processing, assembly, encapsulation, and transport are
therefore decoupled from the forwarding of the corresponding user
packets and can be performed in the data-plane slow-path if
necessary.
o 4: For PBT-M, the types of data collected from each node can vary
depending on application requirements and node capability. This
is either impossible or very difficult to support in the passport
mode, in which the data types collected per node are conveyed by
the instruction header.
o 5: PBT-M makes it easy to secure the collected data without
exposing it to unnecessary entities. For example, both the
configuration and the telemetry data can be encrypted before being
transported, so passive eavesdropping and man-in-the-middle attacks
can both be deterred.
o 6: Even if a user packet under inspection is dropped at some node
in the network, the postcards that are collected from the previous
nodes are still valid and can be used to diagnose the packet drop
location and reason.
3. New Challenges
Although PBT-M addresses the issues of the passport mode telemetry
and the instruction-based PBT, it introduces a few new challenges.
o Challenge 1: A user packet needs to be marked in order to trigger
the path-associated data collection. Since we do not want to
augment user packets with any new header fields, we must reuse
some bit from existing header fields.
o Challenge 2: Since the packet header will not carry OAM
instructions any more, the data plane devices need to be
configured to know what data to collect. However, in general, the
forwarding path of a flow packet (due to ECMP or dynamic routing)
is unknown beforehand (note that there are some notable exceptions
such as segment routing). Configuring the data set for each flow
at all data plane devices is expensive in terms of configuration
load and data plane resources.
o Challenge 3: Due to the variable transport latency, the dedicated
postcard packets for a single packet may arrive at the collector
out of order or be dropped in the network. In order
to infer the packet forwarding path, the collector needs some
information from the postcard packets to identify the user packet
affiliation and the order of path node traversal.
4. Considerations on PBT-M Design
To address the above challenges, we propose several design details of
PBT-M.
4.1. Packet Marking
To trigger the path-associated data collection, usually a single bit
from some header field is sufficient. When no such bit is
available, other packet marking techniques are needed. We discuss
four possible application scenarios.
o IPv4. IPFPM [I-D.ietf-ippm-alt-mark] is an IP flow performance
measurement framework which also requires a single bit for packet
coloring. The difference is that IPFPM does in-network
measurement while PBT-M only collects and exports data at network
nodes (i.e., the data analysis is done at the collector rather
than in the network nodes). IPFPM suggests using some reserved
bit of the Flags field or some unused bit of the TOS field.
IPFPM can be considered a subcase of PBT-M, so the same bit can be
used for PBT-M. The management plane is responsible for configuring
the actual operation mode.
o SFC NSH. The OAM bit in the NSH header can be used to trigger the
on-path data collection [I-D.ietf-sfc-nsh]. PBT does not add any
other metadata to NSH.
o MPLS. Instead of choosing a header bit, we take advantage of the
synonymous flow label [I-D.bryant-mpls-synonymous-flow-labels]
approach to mark the packets. A synonymous flow label indicates
that the on-path data should be collected and exported through a
postcard.
o SRv6. A flag bit in the SRH can be reserved to trigger the on-path
data collection.
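For the IPv4 case, single-bit marking can be sketched as below. This
is an illustration only: it assumes the reserved bit of the Flags
field is the bit chosen (the text above names it as one option and
does not fix the choice), and the helper names set_mark, is_marked,
and ipv4_checksum are hypothetical. Because the bit lies inside the
IPv4 header, the header checksum has to be recomputed whenever the
mark changes.

```python
import struct

FLAGS_OFFSET = 6      # byte holding the 3 flag bits and top of fragment offset
RESERVED_FLAG = 0x80  # high-order bit of that byte: the reserved flag (assumed mark)
CHECKSUM_OFFSET = 10  # byte offset of the header checksum


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the header with the checksum field zeroed."""
    hdr = bytearray(header)
    hdr[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = b"\x00\x00"
    total = sum(struct.unpack(f"!{len(hdr) // 2}H", bytes(hdr)))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def set_mark(header: bytes, marked: bool) -> bytes:
    """Return a copy of the header with the mark set or cleared."""
    hdr = bytearray(header)
    if marked:
        hdr[FLAGS_OFFSET] |= RESERVED_FLAG
    else:
        hdr[FLAGS_OFFSET] &= ~RESERVED_FLAG & 0xFF
    struct.pack_into("!H", hdr, CHECKSUM_OFFSET, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


def is_marked(header: bytes) -> bool:
    return bool(header[FLAGS_OFFSET] & RESERVED_FLAG)


# A 20-byte example header (no options): 10.0.0.1 -> 10.0.0.2, TTL 64, TCP.
base = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                   bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
marked = set_mark(base, True)       # done at the path head node
restored = set_mark(marked, False)  # done at the path end node
```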
4.2. Flow Path Discovery
In case the path a flow traverses is unknown in advance, all PBT-
aware nodes are configured to react to the marked packets by
exporting some basic data such as node ID and TTL before a data set
template for that flow is configured. This way, the management plane
can learn the flow path dynamically.
If the management plane wants to collect the on-path data for some
flow, it configures the head node(s) with a probability or time
interval for the flow packet marking. When the first marked packet
is forwarded in the network, the PBT-aware nodes will export the
basic data to the collector. Hence, the flow path is identified. If
other types of data need to be collected, the management plane can
further configure the data set template to the target nodes on the
flow's path. The PBT-aware nodes would collect and export data
accordingly if the packet is marked and a data set template is
present.
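The two roles described above can be sketched as follows. This is a
hedged illustration, not a specified mechanism: HeadNodeMarker,
fields_to_export, BASIC_DATA, and the "queue_depth" item are
hypothetical names. The sketch uses the time-interval option because
it is deterministic; the probability option would replace the
interval test with a random draw.

```python
from typing import Dict, Tuple

# Basic data exported before any data set template is configured (assumed names).
BASIC_DATA: Tuple[str, ...] = ("node_id", "ttl")


class HeadNodeMarker:
    """Marks at most one packet of each flow per `interval` seconds."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self.last_marked: Dict[str, float] = {}

    def should_mark(self, flow_id: str, now: float) -> bool:
        last = self.last_marked.get(flow_id)
        if last is None or now - last >= self.interval:
            self.last_marked[flow_id] = now
            return True
        return False


def fields_to_export(flow_id: str,
                     templates: Dict[str, Tuple[str, ...]]) -> Tuple[str, ...]:
    """Data items a PBT-aware node exports for a marked packet of a flow."""
    return templates.get(flow_id, BASIC_DATA)


# Head node: packets of flow "f1" arrive at these times (seconds).
marker = HeadNodeMarker(interval=1.0)
decisions = [marker.should_mark("f1", t) for t in (0.0, 0.4, 0.9, 1.0, 1.5, 2.1)]

# Path node: basic data first, then the template once it is configured.
templates: Dict[str, Tuple[str, ...]] = {}
before = fields_to_export("f1", templates)
templates["f1"] = ("node_id", "ttl", "queue_depth")
after = fields_to_export("f1", templates)
```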
If for any reason the flow path is changed, the collector learns the
new path nodes from the basic data they export for the next marked
packet, so the management plane controller can be informed to
configure the new path nodes. The outdated configuration can be
automatically timed out or explicitly revoked by the management plane
controller.
4.3. Packet Identity for Export Data Correlation
The collector needs to correlate all the postcard packets for a
single user packet. Once this is done, the TTL (or the timestamp, if
the network time is synchronized) can be used to infer the flow
forwarding path. The key issue here is to correlate all the
postcards for the same user packet.
The first possible approach is to include the flow ID plus the user
packet ID in the OAM packets. For example, the flow ID can be the
5-tuple of the user traffic, and the user packet ID can be
some unique information pertaining to a user packet (e.g., the
sequence number of a TCP packet).
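A collector following this first approach could be sketched as below.
The postcard layout (a dictionary with "flow_id", "packet_id",
"node_id", and "ttl" keys) and the function name reconstruct_paths
are assumptions made for illustration. Postcards are grouped by the
(flow ID, packet ID) pair, and the nodes in each group are ordered by
decreasing TTL, since the TTL drops by one at every hop.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

FlowId = Tuple[str, str, int, int, int]  # src, dst, protocol, src port, dst port


def reconstruct_paths(postcards: List[dict]) -> Dict[Tuple[FlowId, int], List[str]]:
    """Group postcards per user packet and order the nodes by decreasing TTL."""
    groups = defaultdict(list)
    for pc in postcards:
        groups[(pc["flow_id"], pc["packet_id"])].append(pc)
    return {
        key: [pc["node_id"]
              for pc in sorted(pcs, key=lambda pc: pc["ttl"], reverse=True)]
        for key, pcs in groups.items()
    }


flow: FlowId = ("10.0.0.1", "10.0.0.2", 6, 40000, 443)
# Postcards for two user packets of the same flow, received out of order.
received = [
    {"flow_id": flow, "packet_id": 1001, "node_id": "B", "ttl": 62},
    {"flow_id": flow, "packet_id": 2002, "node_id": "head", "ttl": 64},
    {"flow_id": flow, "packet_id": 1001, "node_id": "head", "ttl": 64},
    {"flow_id": flow, "packet_id": 1001, "node_id": "A", "ttl": 63},
    {"flow_id": flow, "packet_id": 2002, "node_id": "A", "ttl": 63},
]
paths = reconstruct_paths(received)
```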
If the packet marking interval is large enough, then the flow ID
itself is enough to identify the user packet. That is, we can assume
all the exported postcard packets for the same flow during a short
period of time belong to the same user packet.
Alternatively, if the network is synchronized, then the flow ID plus
the timestamp at each node can also be used to infer the postcard
affiliation. However, errors may occur under some circumstances. For
example, if two consecutive user packets from the same flow are both
marked but one exported postcard from a node is lost, then it is
difficult for the collector to decide which user packet the remaining
postcard belongs to. In many cases, such a rare error has no
catastrophic consequence and is therefore tolerable.
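Correlation without a packet ID can be sketched as a time-window
grouping. The function name group_by_window, the tuple layout, and
the window value are hypothetical. The sketch assumes synchronized
node timestamps and a marking interval much larger than the window;
when that assumption does not hold, postcards of different user
packets can fall into one group, which is the ambiguity noted above.

```python
from typing import Dict, List, Tuple

PostcardRecord = Tuple[str, float, str]  # (flow_id, node timestamp, node_id)


def group_by_window(postcards: List[PostcardRecord],
                    window: float) -> List[Tuple[str, List[str]]]:
    """Assign postcards of a flow to the same user packet when their
    timestamps lie within `window` seconds of the group's first postcard."""
    groups: List[Tuple[str, List[str]]] = []
    open_group: Dict[str, Tuple[float, int]] = {}  # flow_id -> (start, group index)
    for flow_id, ts, node_id in sorted(postcards, key=lambda p: p[1]):
        start = open_group.get(flow_id)
        if start is None or ts - start[0] > window:
            groups.append((flow_id, []))
            start = (ts, len(groups) - 1)
            open_group[flow_id] = start
        groups[start[1]][1].append(node_id)
    return groups


# Two marked packets of flow "f1" one second apart; the postcard from
# node A for the second packet was lost.
received = [
    ("f1", 0.003, "B"),
    ("f1", 0.001, "head"),
    ("f1", 1.003, "B"),
    ("f1", 0.002, "A"),
    ("f1", 1.001, "head"),
]
groups = group_by_window(received, window=0.1)
```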
4.4. Avoid Packet Marking through Node Configuration
It is possible to avoid marking user packets while still allowing
in-band flow data collection. We could simply configure the
Access Control List (ACL) to select the set of target flows.
This approach has two potential issues: (1) Since the packet
forwarding path is unknown in advance, one needs to configure all the
nodes in a network to filter the flows and capture the complete data
set. This wastes the precious ACL resource and is not scalable. (2)
If a node cannot collect data for all the filtered packets of a flow,
it needs to determine which packets to sample independently, so the
collector may not be able to receive the full set of postcards for
the same user packet.
Nevertheless, since this approach does not require touching the user
packets at all, it has its unique merits: (1) Users can freely choose
any nodes as vantage points for data collection; (2) There is no need
to worry about "modified" user packets leaking out of the PBT domain;
(3) It has the minimum impact on the forwarding of the user traffic.
No data plane standard is required to support this mode, except the
postcard format.
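The ACL-based selection can be pictured with the sketch below. It is
only an illustration: FiveTuple, AclRule, and selected_for_telemetry
are hypothetical names, and addresses are compared as exact strings
rather than as prefixes. A field left as None acts as a wildcard.

```python
from typing import List, NamedTuple, Optional


class FiveTuple(NamedTuple):
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


class AclRule(NamedTuple):
    src: Optional[str] = None
    dst: Optional[str] = None
    proto: Optional[int] = None
    sport: Optional[int] = None
    dport: Optional[int] = None

    def matches(self, pkt: FiveTuple) -> bool:
        # Fields are compared position by position; None matches anything.
        return all(want is None or want == got for want, got in zip(self, pkt))


def selected_for_telemetry(pkt: FiveTuple, acl: List[AclRule]) -> bool:
    """True if the node should generate a postcard for this unmarked packet."""
    return any(rule.matches(pkt) for rule in acl)


# The same rule would have to be installed on every node in the network.
acl = [AclRule(dst="10.0.0.2", proto=6, dport=443)]
https_pkt = FiveTuple("10.0.0.1", "10.0.0.2", 6, 40000, 443)
http_pkt = FiveTuple("10.0.0.1", "10.0.0.2", 6, 40001, 80)
hits = [selected_for_telemetry(p, acl) for p in (https_pkt, http_pkt)]
```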
5. Postcard Format
Postcards can use the same data export format as that used by IOAM.
[I-D.spiegel-ippm-ioam-rawexport] proposes a raw format that can be
interpreted by IPFIX.
6. Security Considerations
Several security issues need to be considered.
o Eavesdropping and tampering: the postcards can be encrypted and
authenticated to avoid such security threats.
o DoS attack: PBT can be limited to a single administrative domain.
The mark must be removed at the egress domain edge. The node can
rate limit the extra traffic incurred by postcards.
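One way a node might rate limit postcards is a token bucket, sketched
below. The token bucket is a common technique chosen here for
illustration and is not prescribed by this document; the class name
PostcardRateLimiter and the rate and burst values are hypothetical.
The current time is passed in explicitly so the behavior is easy to
follow.

```python
from typing import Optional


class PostcardRateLimiter:
    """Token bucket: `rate` postcards per second, bursts of up to `burst`."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last: Optional[float] = None

    def allow(self, now: float) -> bool:
        if self.last is not None:
            # Refill in proportion to elapsed time, capped at the bucket size.
            refill = (now - self.last) * self.rate
            self.tokens = min(self.burst, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the postcard for this marked packet is suppressed


limiter = PostcardRateLimiter(rate=2.0, burst=2.0)
# Marked packets seen at these times (seconds).
allowed = [limiter.allow(t) for t in (0.0, 0.0, 0.0, 0.5, 0.5)]
```

A suppressed postcard leaves a gap in the path seen by the collector,
so the limit trades completeness for protection of the node and the
export channel.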
7. IANA Considerations
No requirement for IANA is identified.
8. Contributors
TBD.
9. Acknowledgments
TBD.
10. Informative References
[I-D.brockners-inband-oam-transport]
Brockners, F., Bhandari, S., Govindan, V., Pignataro, C.,
Gredler, H., Leddy, J., Youell, S., Mizrahi, T., Mozes,
D., Lapukhov, P., and R. Chang, "Encapsulations for In-
situ OAM Data", draft-brockners-inband-oam-transport-05
(work in progress), July 2017.
[I-D.bryant-mpls-synonymous-flow-labels]
Bryant, S., Swallow, G., Sivabalan, S., Mirsky, G., Chen,
M., and Z. Li, "RFC6374 Synonymous Flow Labels", draft-
bryant-mpls-synonymous-flow-labels-01 (work in progress),
July 2015.
[I-D.ietf-ippm-alt-mark]
Fioccola, G., Capello, A., Cociglio, M., Castaldelli, L.,
Chen, M., Zheng, L., Mirsky, G., and T. Mizrahi,
"Alternate Marking method for passive and hybrid
performance monitoring", draft-ietf-ippm-alt-mark-14 (work
in progress), December 2017.
[I-D.ietf-ippm-ioam-data]
Brockners, F., Bhandari, S., Pignataro, C., Gredler, H.,
Leddy, J., Youell, S., Mizrahi, T., Mozes, D., Lapukhov,
P., Chang, R., Bernier, D., and J. Lemon, "Data Fields for
In-situ OAM", draft-ietf-ippm-ioam-data-09 (work in
progress), March 2020.
[I-D.ietf-ippm-ioam-direct-export]
Song, H., Gafni, B., Zhou, T., Li, Z., Brockners, F.,
Bhandari, S., Sivakolundu, R., and T. Mizrahi, "In-situ
OAM Direct Exporting", draft-ietf-ippm-ioam-direct-
export-00 (work in progress), February 2020.
[I-D.ietf-sfc-nsh]
Quinn, P., Elzur, U., and C. Pignataro, "Network Service
Header (NSH)", draft-ietf-sfc-nsh-28 (work in progress),
November 2017.
[I-D.song-ippm-ioam-ipv6-support]
Song, H., Li, Z., and S. Peng, "Approaches on Supporting
IOAM in IPv6", draft-song-ippm-ioam-ipv6-support-00 (work
in progress), March 2020.
[I-D.spiegel-ippm-ioam-rawexport]
Spiegel, M., Brockners, F., Bhandari, S., and R.
Sivakolundu, "In-situ OAM raw data export with IPFIX",
draft-spiegel-ippm-ioam-rawexport-01 (work in progress),
October 2018.
[RFC2925] White, K., "Definitions of Managed Objects for Remote
Ping, Traceroute, and Lookup Operations", RFC 2925,
DOI 10.17487/RFC2925, September 2000,
<https://www.rfc-editor.org/info/rfc2925>.
[RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
"Specification of the IP Flow Information Export (IPFIX)
Protocol for the Exchange of Flow Information", STD 77,
RFC 7011, DOI 10.17487/RFC7011, September 2013,
<https://www.rfc-editor.org/info/rfc7011>.
Authors' Addresses
Haoyu Song (editor)
Futurewei
2330 Central Expressway
Santa Clara, 95050
USA
Email: hsong@futurewei.com
Tianran Zhou
Huawei
156 Beiqing Road
Beijing, 100095
P.R. China
Email: zhoutianran@huawei.com
Zhenbin Li
Huawei
156 Beiqing Road
Beijing, 100095
P.R. China
Email: lizhenbin@huawei.com
Jongyoon Shin
SK Telecom
South Korea
Email: jongyoon.shin@sk.com
Kyungtae Lee
LG U+
South Korea
Email: coolee@lguplus.co.kr