Opsawg Working Group                                             X. Ding
Internet-Draft                                                    W. Liu
Intended status: Informational                                    Huawei
Expires: May 3, 2018                                               C. Li
                                                           China Telecom
                                                        October 30, 2017


         Network Data Use Case for Wavelength Division Service
                draft-ding-opsawg-wavelength-use-case-00

Abstract

   This document describes use cases that demonstrate the applicability
   of network data to evaluate the performance of wavelength division
   service.  The objective of this draft is not to cover the wavelength
   division service in detail.  Rather, the intention is to illustrate
   the requirements of network data used to evaluate the performance of
   wavelength division service.

   General characteristics of network data and two typical use cases are
   presented in this document to demonstrate the different application
   scenarios of network data in wavelength division service.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 3, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents



Ding, et al.               Expires May 3, 2018                  [Page 1]


Internet-Draft                 WD Use Case                  October 2017


   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Conventions used in this document . . . . . . . . . . . . . .   3
   3.  Characteristics of network data . . . . . . . . . . . . . . .   3
   4.  Use cases . . . . . . . . . . . . . . . . . . . . . . . . . .   4
     4.1.  Anomaly detection . . . . . . . . . . . . . . . . . . . .   4
     4.2.  Risk assessment . . . . . . . . . . . . . . . . . . . . .   5
   5.  Data Issues . . . . . . . . . . . . . . . . . . . . . . . . .   6
     5.1.  Merge data from different time periods  . . . . . . . . .   6
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .   6
   7.  Conclusions . . . . . . . . . . . . . . . . . . . . . . . . .   7
   8.  Normative References  . . . . . . . . . . . . . . . . . . . .   7
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   7

1.  Introduction

   Wavelength-division multiplexing (WDM) is a method of combining
   multiple signals on laser beams at various infrared (IR) wavelengths
   for transmission along fiber optic media.  A WDM system uses a
   multiplexer at the transmitter to join the signals together and a
   demultiplexer at the receiver to split them apart.  While a
   wavelength division service is running, the wavelength division
   devices continuously generate network data that reflects the state
   of the running service.

   In the case of wavelength division service, customers are accustomed
   to handling a network failure only after the service has been
   interrupted.  Such a passive strategy is inefficient and easily
   leads to long service interruptions.  Network data collected from
   devices is real and reliable, and it can help customers predict the
   trend of wavelength division optical performance.  Statistical
   characteristics of network data can help operators determine the
   point in time at which the service changes between normal and
   abnormal, or between healthy and risky.

   This document describes use cases that lead to the requirements for
   supporting wavelength division performance evaluation.  The
   objective of this draft is not to cover the wavelength division
   service in detail.  Rather, the intention is to illustrate the
   requirements of network data used to evaluate the performance of
   wavelength division service.

   General characteristics of network data and two typical use cases are
   presented in this document to demonstrate the different application
   scenarios of network data in wavelength division service.  Moreover,
   the question of how to integrate network data collected from
   different time periods is raised.

2.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   KPI: Key Performance Indicator.  A network KPI represents the
   operational state of a network device, link, or network protocol in
   the network.  KPI data is usually presented to users as a set of
   time series

   (e.g., KPI = x_i, i=1..t),

   where each time series corresponds to the values of one network KPI
   at successive time points during a specific time period.

3.  Characteristics of network data

   Network data is information that is collected from various data
   sources and transmitted to one or more receiving devices for
   analysis tasks [I-D.ietf-wu-t2trg-network-telemetry].
   Analysis tasks may include event correlation, anomaly detection, risk
   detection, performance monitoring, trend analysis, and other related
   processes.

   Network data is a series of data points indexed in time order.  Data
   taken over time may have an internal structure, such as a trend,
   seasonal variation, or outliers.  Trend means that, on average, the
   measurements tend to increase (or decrease) over time.  Seasonality
   means that there is a regularly repeating pattern of highs and lows
   related to calendar time, such as seasons, quarters, months, or days
   of the week.  Outliers are points that lie far away from the rest of
   the data, just as outliers in regression lie far away from the
   fitted line.
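
   As a rough illustration of these three structures, the following
   Python sketch estimates a trend as a least-squares slope, a seasonal
   profile as the mean value at each phase of a known period, and
   outliers with a median-absolute-deviation rule.  The function names,
   the choice of estimators, the sample values, and the cutoff of 3.5
   are assumptions made for this example; this document does not
   prescribe any of them.

```python
from statistics import mean, median


def trend_slope(values):
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def seasonal_profile(values, period):
    """Mean value at each phase of an assumed, known period."""
    return [mean(values[phase::period]) for phase in range(period)]


def mad_outliers(values, cutoff=3.5):
    """Indices whose modified z-score exceeds cutoff (an assumed value)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median, nothing to flag
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > cutoff]


# Made-up input optical power samples (dBm) with one obvious dip.
power = [-3.0, -3.1, -3.0, -3.2, -3.1, -9.5, -3.2, -3.3]
slope = trend_slope(power)        # change per sample, sign gives direction
suspects = mad_outliers(power)    # indices of samples far from the rest
```

   The median-based rule is used here only because a single extreme
   sample does not distort the median the way it distorts the mean.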

   Network time series data analysis comprises methods for analyzing
   time series data in order to extract meaningful statistics and other
   characteristics of the data.

   Network data has the following major characteristics:

   o  Subject: The subject is the object to be measured, and it has
      multiple properties from different dimensions.  For example, in
      a wavelength division service performance monitoring scenario,
      the subject of the measurement is the 'optical module', whose
      attributes may include board name, device name, and so on.

   o  Measured values: A subject may have one or more measured values,
      and each measurement corresponds to a specific indicator.  In
      the same wavelength division monitoring scenario, the measured
      indicators may include FEC_bef (Forward Error Correction coding
      before error correction), FEC_aft (Forward Error Correction coding
      after error correction), input optical power, output optical
      power, etc.

   o  Timestamp: Each report of the measured value will have a timestamp
      attribute to indicate its time.
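
   The three characteristics above can be sketched as a small data
   model.  The class names, field names, and sample values below are
   hypothetical and serve only to show how a report ties a subject, its
   measured values, and a timestamp together, and how one KPI time
   series can be extracted from a set of reports.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Subject:
    """The measured object and its descriptive attributes."""
    kind: str                                   # e.g., "optical module"
    attributes: dict = field(default_factory=dict)


@dataclass
class Report:
    """One report: a subject, a timestamp, and its measured values."""
    subject: Subject
    timestamp: datetime
    values: dict                                # indicator name -> value


def to_time_series(reports, indicator):
    """Return (timestamp, value) pairs for one indicator, in time order."""
    points = [(r.timestamp, r.values[indicator])
              for r in reports if indicator in r.values]
    return sorted(points)


# Made-up example: one optical module reporting twice, out of order.
module = Subject("optical module",
                 {"board_name": "board-1", "device_name": "ne-1"})
reports = [
    Report(module, datetime(2017, 10, 30, 0, 5, tzinfo=timezone.utc),
           {"FEC_bef": 2.0e-6, "input_optical_power": -3.2}),
    Report(module, datetime(2017, 10, 30, 0, 0, tzinfo=timezone.utc),
           {"FEC_bef": 1.0e-6, "input_optical_power": -3.1}),
]
series = to_time_series(reports, "input_optical_power")
```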

4.  Use cases

   The following sections highlight some of the most common wavelength
   division use case scenarios and are in no way exhaustive.

4.1.  Anomaly detection

   In a Data Analytics Engine, anomaly detection is the identification
   of items, events, or observations that do not conform to an expected
   pattern or to other items in the data.  Typically, the anomalous
   items translate to some kind of problem, such as an optical layer
   problem.

   For network equipment performance anomalies, multiple features are
   usually extracted from KPI data, such as time, value, frequency,
   etc., and used as the key factors for anomaly analysis.

   Taking wavelength division service as an example, collected
   information such as FEC_bef, input optical power, laser bias
   current, and other key factors can be selected to keep track of the
   wavelength division service over time and to calculate device
   statistics for a specific time period, such as the average device
   downtime in the specified time window.  These statistics can be
   further used to detect wavelength division service anomalies or to
   improve the accuracy of wavelength division KPI anomaly detection.
   In this scenario, we do not rely on a manually preconfigured
   threshold to trigger an alarm; instead, we automatically detect KPI
   anomalies in advance and raise an alarm, as seen in Figure 1.

       +---------+    +-----------+    +-----------+    +-------+
       | Network |    | feature   |    | anomaly   |    | raise |
       | data    +--->| selection +--->| detection +--->| alarm |
       +---------+    +-----------+    +-----------+    +-------+


                        Figure 1: anomaly detection
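
   The three stages of Figure 1 can be sketched as follows.  This is a
   minimal illustration, not the detection method of this document: it
   assumes a trailing-window baseline and flags a sample that deviates
   from that baseline by more than k standard deviations, so the alarm
   limit follows the data instead of a fixed preconfigured value.  The
   function names, the window size, the factor k, and the sample
   values are all assumptions.

```python
from statistics import mean, stdev


def select_features(records, names):
    """Feature selection: keep only the named KPIs from each record."""
    return {name: [r[name] for r in records] for name in names}


def detect_anomalies(series, window=5, k=3.0):
    """Indices that deviate more than k sigma from the trailing window."""
    anomalies = []
    for i in range(window, len(series)):
        base = series[i - window:i]
        mu, sigma = mean(base), stdev(base)
        # A perfectly flat baseline (sigma == 0) is skipped in this sketch.
        if sigma > 0 and abs(series[i] - mu) > k * sigma:
            anomalies.append(i)
    return anomalies


def raise_alarms(features, **kw):
    """Raise alarm: one (KPI name, sample index) pair per anomaly."""
    return [(name, i) for name, s in features.items()
            for i in detect_anomalies(s, **kw)]


# Made-up samples: optical power dips at index 6, bias current is steady.
power = [-3.0, -3.1, -2.9, -3.0, -3.1, -3.0, -6.0, -3.0]
bias = [30.0, 30.1, 29.9, 30.0, 30.1, 30.0, 30.1, 30.0]
records = [{"input_optical_power": p, "laser_bias_current": b, "other": 0}
           for p, b in zip(power, bias)]
features = select_features(
    records, ["input_optical_power", "laser_bias_current"])
alarms = raise_alarms(features, window=5, k=3.0)
```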

4.2.  Risk assessment

   In a Data Analytics Engine, risk assessment is a component that aims
   to provide an estimate of the overall network risk condition.
   Unlike the anomaly detection component, which copes with network
   faults and failures that have already happened, the risk assessment
   module's goal is to anticipate network events and to forecast
   short-term changes and risks in the network based on the trends of
   network data (e.g., fast growing, fast dropping, slowly increasing,
   and slowly decreasing KPI data).  This opens up a channel to reveal
   potential network problems or to identify the need for network
   optimization and upgrade.

   Network KPIs provide a fine-grained understanding of network
   performance, which brings more value to network maintenance and
   operation, including identifying possible bottlenecks and
   dimensioning issues and locating the need to perform network
   optimization.  With the various monitoring mechanisms in place,
   administrators can be informed at a very early stage if any high
   risk occurs in the network.  The ability to properly handle large
   amounts of noisy KPI data is vital to gaining these insights.

   Given hundreds of thousands of KPI data items, assessing network
   risk is challenging.  Good network risk assessment criteria should
   be indicative of local network-level problems, and hence be able to
   provide prompt warnings and help locate potential problems when
   trivial but persisting anomalies are observed.  Meanwhile, they must
   also describe system performance in a global sense by aggregating
   multi-faceted information from a large number of KPIs across the
   network infrastructure.  There are two strategies for designing such
   KPI-based network risk criteria, as shown in Figure 2:

             +---------+    +------------+     +------------+
             | Network |    | single KPI |     | risk       |
             | data    +--->| scoring    +---->| assessment |
             +---------+    +-----+------+     +------------+
                                  |                   ^
                                  |                   |
                            +-----v------+            |
                            | multi-KPI  |            |
                            | scoring    +------------+
                            +------------+

                         Figure 2: risk assessment

   1) Single KPI scoring: The scoring strategy for a single KPI.  In
   this case, different dimensions of a KPI should be examined to score
   the KPI.

   2) Multi-KPI scoring: The scoring strategy for assessing the network
   risk using the values of many KPIs.  If a device or a service is
   monitored by several key KPIs, the risk should be analyzed by
   integrating these KPI scores.
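
   One possible reading of the two strategies is sketched below.  It
   assumes that a single KPI is scored from two dimensions, its current
   level and its recent trend, and that the multi-KPI score is a
   weighted mean of the single KPI scores.  The scoring formulas, the
   parameters (nominal, tolerance, horizon), the weights, and the
   sample values are all hypothetical; this document does not define
   them.

```python
def clamp01(x):
    """Limit a score to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def single_kpi_score(series, nominal, tolerance, horizon=4):
    """Score one KPI from two dimensions, level and trend.

    series needs at least two samples.  nominal, tolerance, and horizon
    are hypothetical tuning parameters, not defined by the draft.
    """
    # Level: distance of the latest value from nominal, in tolerances.
    level = abs(series[-1] - nominal) / tolerance
    # Trend: average change per sample, projected horizon samples ahead.
    step = (series[-1] - series[0]) / (len(series) - 1)
    projected = abs(series[-1] + horizon * step - nominal) / tolerance
    return clamp01(max(level, projected))


def multi_kpi_score(scores, weights):
    """Weighted mean of the single KPI scores of one device or service."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


# Made-up samples: optical power is slowly dropping, bias current is flat.
scores = {
    "input_optical_power":
        single_kpi_score([-3.0, -3.2, -3.4, -3.6], -3.0, 2.0),
    "laser_bias_current":
        single_kpi_score([30.0, 30.0, 30.0, 30.0], 30.0, 5.0),
}
risk = multi_kpi_score(
    scores, {"input_optical_power": 2, "laser_bias_current": 1})
```

   In this sketch the slowly dropping optical power scores higher than
   its current level alone would suggest, because the projected value
   moves further from nominal.  That is the kind of trend-based early
   warning the text describes.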

5.  Data Issues

5.1.  Merge data from different time periods

   In the process of data collection, the collection period of the same
   KPI may differ from one source to another.  For example, for a
   service deployed across multiple domains, network devices may use
   many different collection periods, such as 30 s, 5 min, 15 min, and
   so on.

   KPI data collected from different domains needs to be analyzed for
   correlation.  For example, anomaly detection is performed on
   wavelength division service data from different domains, and the
   results are compared among the domains.  We therefore need to merge
   data sets with different collection periods into an integrated data
   set using a summary metric for each period, such as the mean value,
   peak value, or median value.  This raises the question of how these
   data sets can be stored and assessed with high efficiency.
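
   A minimal sketch of such a merge is given below.  It assumes that
   every sample carries a timestamp in epoch seconds, that all series
   are aligned to a common, coarser period, and that each bucket is
   summarized by its mean, peak, or median.  The function names, the
   bucket alignment rule, and the sample values are assumptions; the
   storage and access question raised above is not addressed.

```python
from collections import defaultdict
from statistics import mean, median

AGGREGATORS = {"mean": mean, "peak": max, "median": median}


def resample(points, period_s, how="mean"):
    """Aggregate (epoch_seconds, value) points into period_s buckets.

    Returns {bucket_start: aggregated value}, ordered by bucket start.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % period_s].append(value)
    agg = AGGREGATORS[how]
    return {start: agg(vals) for start, vals in sorted(buckets.items())}


def merge(series_by_domain, period_s, how="mean"):
    """Merge per-domain series into rows keyed by common bucket start.

    A domain without a sample in a bucket gets None for that bucket.
    """
    resampled = {d: resample(p, period_s, how)
                 for d, p in series_by_domain.items()}
    starts = sorted(set().union(*resampled.values()))
    return {s: {d: r.get(s) for d, r in resampled.items()} for s in starts}


# Made-up data: domain A reports every 30 s, domain B every 5 min.
data = {
    "A": [(0, 1.0), (30, 3.0), (60, 2.0), (300, 4.0)],
    "B": [(0, 10.0), (300, 20.0)],
}
merged = merge(data, period_s=300, how="mean")
```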

6.  Security Considerations

   TBD.

7.  Conclusions

   TBD.

8.  Normative References

   [I-D.ietf-wu-t2trg-network-telemetry]
              Wu, Q., "Network Telemetry and Big Data Analysis", March
              2016.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", March 1997.

Authors' Addresses

   Xiaojian Ding
   Huawei
   101 Software Avenue, Yuhua District
   Nanjing, Jiangsu  210012
   China

   Email: dingxiaojian1@huawei.com


   Will(Shucheng) Liu
   Huawei
   Bantian, Longgang District
   Shenzhen  518129
   P.R. China

   Email: liushucheng@huawei.com


   Chen Li
   China Telecom
   No.118 Xizhimennei street, Xicheng District
   Beijing  100035
   P.R. China

   Email: lichen@ctbri.com.cn