Network Working Group                                           Y. Zheng
Internet-Draft                                              China Unicom
Intended status: Informational                                     S. Xu
Expires: September 14, 2017                                     D. Dhody
                                                     Huawei Technologies
                                                          March 13, 2017


           Usecases for Network Artificial Intelligence (NAI)
               draft-zheng-opsawg-network-ai-usecases-00

Abstract

   This document discusses the scope of Network Artificial Intelligence
   (NAI) and possible use cases that demonstrate the advantage of
   applying NAI.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 14, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.



Zheng, et al.          Expires September 14, 2017               [Page 1]


Internet-Draft               Usecases of NAI                  March 2017


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  NAI Architecture  . . . . . . . . . . . . . . . . . . . . . .   3
   3.  NAI Use Cases . . . . . . . . . . . . . . . . . . . . . . . .   3
     3.1.  Traffic Prediction and Re-Optimization/Adjustment . . . .   3
     3.2.  Route Monitoring and Analytics  . . . . . . . . . . . . .   4
     3.3.  Multilayer Fault Detection In NFV Framework . . . . . . .   5
     3.4.  Data Center Network Use Cases . . . . . . . . . . . . . .   7
       3.4.1.  Service Function Chaining . . . . . . . . . . . . . .   7
   4.  Contributors  . . . . . . . . . . . . . . . . . . . . . . . .   8
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .   8
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   8
   7.  Acknowledgement . . . . . . . . . . . . . . . . . . . . . . .   8
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   8
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .   9
     8.2.  Informative References  . . . . . . . . . . . . . . . . .   9
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   9

1.  Introduction

   Current networks have become much more dynamic and complex, which
   poses new challenges for network management and optimization.  For
   example, network management and optimization should be automated to
   reduce human intervention (and thus to minimize operational
   expense).  Artificial Intelligence (AI) and Machine Learning (ML)
   are promising approaches to realize such automation, and for some
   tasks may even outperform human operators.  Furthermore, the
   popularity of the Software-Defined Networking (SDN) paradigm makes
   the application of AI in networks practical, since the SDN
   controller has a network-wide view of the network status and can
   control the behavior of network nodes to implement AI decisions.

   AI and ML technologies can learn from historical data and make
   predictions or decisions, rather than following strictly static
   program instructions.  They can dynamically adapt to a changing
   situation and improve by learning from new data, which allows them
   to learn and complete complicated tasks.  They also have potential
   in the network technology area, especially with SDN and Network
   Function Virtualization (NFV).

   This document presents the concept of Network Artificial
   Intelligence (NAI).  It first discusses the scope of NAI, and then
   discusses some use cases that demonstrate the advantage of applying
   NAI.

2.  NAI Architecture

   The architecture of NAI is described in
   [I-D.li-rtgwg-network-ai-arch].  In this architecture, the central
   controller is the core part of NAI and can be called the 'Network
   Brain'.  Network Telemetry and Analytics (NTA) engines can be
   introduced alongside the central controller.  An NTA engine
   includes a data collector, an analytics framework, data
   persistence, and NAI applications.

                   ^                                   ^
                (4)|                                   |(4)
   +---------------|--------------+    +---------------|--------------+
   | Domain 1      |              |    |               |     Domain 2 |
   |        +------------+        |    |        +------------+        |
   |        |  Central   |        |    |        |  Central   |        |
   |     (1)| Controller |----------------------| Controller |(1)     |
   |        |    with    |        |    |        |    with    |        |
   |        |     NTA    |        |    |        |     NTA    |        |
   |        +------------+        |    |        +------------+        |
   |         /          \         |    |         /          \         |
   |     (3)/            \        |    |        /            \(3)     |
   |       /              \       |    |       /              \       |
   | +--------+        +--------+ |    | +--------+        +--------+ |
   | |        |        |        | |    | |        |        |        | |
   | |Network | ...... |Network | |    | |Network | ...... |Network | |
   | | Device |  (2)   | Device | |    | | Device |  (2)   | Device | |
   | |    1   |        |    N   | |    | |    1   |        |    N   | |
   | +--------+        +--------+ |    | +--------+        +--------+ |
   |                              |    |                              |
   +------------------------------+    +------------------------------+

    Figure 1: An Architecture of Network Artificial Intelligence(NAI)
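
   As an illustration only, the four NTA engine parts named above can
   be sketched as a small pipeline.  The class and method names below
   are hypothetical and are not defined by
   [I-D.li-rtgwg-network-ai-arch]; a real engine would use a telemetry
   transport and a database rather than an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, float]
Model = Callable[[List[Record]], float]


@dataclass
class NTAEngine:
    """Toy NTA engine: collector, persistence, analytics, applications."""

    store: List[Record] = field(default_factory=list)   # data persistence
    models: Dict[str, Model] = field(default_factory=dict)

    def collect(self, record: Record) -> None:
        """Data collector: accept one telemetry record and persist it."""
        self.store.append(record)

    def register(self, name: str, model: Model) -> None:
        """Analytics framework: plug in a named analysis function."""
        self.models[name] = model

    def query(self, name: str) -> float:
        """NAI application entry point: run an analysis on stored data."""
        return self.models[name](self.store)


engine = NTAEngine()
engine.register("mean_util",
                lambda rows: sum(r["util"] for r in rows) / len(rows))
engine.collect({"util": 0.25})
engine.collect({"util": 0.75})
print(engine.query("mean_util"))  # 0.5
```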

3.  NAI Use Cases

3.1.  Traffic Prediction and Re-Optimization/Adjustment

   This subsection introduces the Path Computation Element (PCE)
   [RFC4655] use cases in wide area networks (WANs).  In the PCE
   scenario, network data is collected through control plane protocols
   such as the PCE Communication Protocol (PCEP) and BGP-LS [RFC7752],
   and is passed to the PCE application.  Through PCEP, the PCE
   receives the state of Label Switched Paths (LSPs) from the network;
   through BGP-LS, it receives the topology information.  If network
   telemetry is used, traffic information can also be received
   directly at the NTA engine using protocols such as gRPC.

   The PCE application (APP) only maintains the latest information.
   To enable NAI, the history of all LSP and topology changes is
   stored in an external data repository.  Traffic monitoring data can
   also be collected and stored if network telemetry is used.  There
   are two use cases in this scenario: (1) rerouting/re-optimization
   using historical trends and predictions from AI; (2) traffic
   congestion avoidance and AI-enabled auto-bandwidth adjustment.

   For use case (1), the analytics component in the NTA engine can use
   stored data to build models that predict the impact of network
   events and the state of the LSPs.  For example, it can use
   historical trends to guide path computation to include or exclude
   specific links.  Finding correlations in the data, detecting
   anomalies, and data visualization are also possible.
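
   A minimal sketch of history-guided path computation follows.  It
   assumes a hypothetical event history of (link, state) tuples and
   treats links with more than max_flaps "down" events as excluded; the
   threshold, the topology, and the function names are illustrative
   and are not part of PCE or PCEP.

```python
import heapq
from collections import Counter


def unstable_links(events, max_flaps):
    """Return links whose historical down-event count exceeds max_flaps."""
    flaps = Counter(link for link, state in events if state == "down")
    return {link for link, count in flaps.items() if count > max_flaps}


def shortest_path(graph, src, dst, excluded=frozenset()):
    """Dijkstra over {node: {neighbor: cost}}, skipping excluded links."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, weight in graph.get(node, {}).items():
            if frozenset((node, nbr)) in excluded or nbr in visited:
                continue
            heapq.heappush(queue, (cost + weight, nbr, path + [nbr]))
    return None  # no path satisfies the constraints


# Hypothetical topology and link-event history.
graph = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}
history = ([(frozenset(("B", "D")), "down")] * 4
           + [(frozenset(("A", "C")), "down")])

excluded = unstable_links(history, max_flaps=3)
print(shortest_path(graph, "A", "D"))            # (2, ['A', 'B', 'D'])
print(shortest_path(graph, "A", "D", excluded))  # (4, ['A', 'C', 'D'])
```

   The second call avoids the B-D link because its history shows
   repeated failures, at the price of a higher path cost.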

   The analytics component in the NTA engine can also use stored data
   to detect and predict network events and request the PCE to take
   necessary actions.  For example, it can use historical trends in
   network bandwidth utilization to request re-optimization.
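
   One simple way to turn a utilization trend into a re-optimization
   request is to fit a straight line to recent samples and project it
   forward.  The sketch below assumes equally spaced samples; the
   threshold and horizon values are hypothetical, and a deployed system
   would likely use a richer model.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for equally spaced samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def needs_reoptimization(samples, threshold, horizon):
    """True if the trend reaches threshold within `horizon` intervals."""
    slope, intercept = linear_trend(samples)
    projected = intercept + slope * (len(samples) - 1 + horizon)
    return projected >= threshold


# Hypothetical link utilization samples (fraction of capacity).
utilization = [0.50, 0.55, 0.60, 0.65, 0.70]
print(needs_reoptimization(utilization, threshold=0.85, horizon=4))  # True
print(needs_reoptimization(utilization, threshold=0.85, horizon=2))  # False
```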

   For use case (2), with network telemetry, the NTA engine can collect
   per-link and per-LSP traffic flow data from the network using gRPC.
   Such network telemetry data includes statistics for tunnels, links,
   bandwidth reservations, actual usage, delay, jitter, packet loss,
   etc.  It also collects data regarding network events and their
   impact on traffic flows.  The analytics component can use telemetry
   data to build traffic models that predict traffic congestion ahead
   of events such as New Year or major sporting events.  Based on the
   congestion prediction, the PCE application can reroute traffic to
   avoid congested links.

   In addition, the NTA engine can perform prediction and make
   necessary changes to the network.  In particular, the PCE
   application performs bandwidth usage prediction (i.e., bandwidth
   calendaring) by looking at the historical trend of all sampled data
   instead of only the most recent sample.  The collected data includes
   the Traffic Engineering Database (TEDB) and the LSP database
   (LSP-DB), and can also include scheduling information.  The
   collected data also includes auto-bandwidth related changes under
   particular network events.  Using machine learning algorithms, the
   analytics component is able to correlate such changes with the
   events, and to predict network events and their impact.
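
   The difference between sizing a reservation from the latest sample
   and sizing it from history can be sketched as follows.  The
   nearest-rank percentile and the 10% headroom are assumptions chosen
   for illustration; they are not an auto-bandwidth algorithm defined
   by this document or by PCEP.

```python
def percentile(samples, pct):
    """Nearest-rank percentile (integer pct, 1..100) of a non-empty list."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def reserve_from_history(samples_mbps, pct=95, headroom=1.1):
    """Reservation derived from the historical distribution plus headroom."""
    return percentile(samples_mbps, pct) * headroom


# Hypothetical per-LSP rate samples in Mbit/s; the last sample is a lull.
history = [100, 120, 110, 300, 130, 125, 115, 140, 135, 40]
print(history[-1])                              # 40
print(round(reserve_from_history(history), 1))  # 330.0
```

   Sizing from the instant sample alone (40 Mbit/s) would leave the LSP
   under-provisioned for the peak that the history reveals.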

3.2.  Route Monitoring and Analytics

   This subsection introduces the BGP Monitoring Protocol (BMP)
   [RFC7854] use case in wide area networks (WANs).  BGP is known for
   its flexibility and its ability to manage a large number of
   neighbors and routes.  It is also the basis for many overlay
   services such as L3VPN and L2VPN.  BMP can be used by the
   controller to monitor BGP neighbor status and routing information
   on the routers.

   According to [RFC7854], a BMP client located in the router collects
   BGP neighbor status, routes for each neighbor, and events defined by
   the user.  It then passes this information through BMP to the
   management station located on the controller.  Based on BMP
   monitoring of BGP, there are three use cases: (1) BGP route leak
   monitoring; (2) BGP hijack monitoring; (3) traffic analytics.

   Route leaks involve the illegitimate advertisement of prefixes
   (blocks of IP addresses), which propagate across networks and lead
   to incorrect or suboptimal routing.  For case (1), NAI applications
   can analyze the routes collected through BMP to detect BGP route
   leaks.
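
   One commonly used heuristic for spotting candidate route leaks is
   the "valley-free" rule: once a route has been passed to a peer or a
   customer, it should not be passed on to another provider or peer.
   This document does not prescribe that rule; the sketch below merely
   assumes it, together with a hypothetical AS relationship table that
   uses AS numbers reserved for documentation.

```python
# Hypothetical relationship table: (a, b) -> role of b as seen from a.
# "provider" means b is a's provider; "customer" means b is a's customer.
RELATIONSHIPS = {
    (64501, 64502): "provider",
    (64502, 64503): "peer",
    (64503, 64504): "customer",
    (64504, 64505): "provider",
}
INVERSE = {"provider": "customer", "customer": "provider", "peer": "peer"}


def relationship(a, b):
    """Role of b relative to a, or None if the pair is unknown."""
    if (a, b) in RELATIONSHIPS:
        return RELATIONSHIPS[(a, b)]
    return INVERSE.get(RELATIONSHIPS.get((b, a)))


def is_possible_leak(as_path):
    """Check the valley-free rule along the propagation direction.

    as_path is in BGP order (most recent AS first, origin AS last).
    """
    hops = list(reversed(as_path))  # origin first
    descending = False
    for sender, receiver in zip(hops, hops[1:]):
        rel = relationship(sender, receiver)
        if rel is None:
            continue  # unknown relationship: no verdict for this hop
        if descending and rel in ("provider", "peer"):
            return True
        if rel in ("peer", "customer"):
            descending = True
    return False


print(is_possible_leak([64505, 64504, 64503, 64502, 64501]))  # True
print(is_possible_leak([64504, 64503, 64502, 64501]))         # False
```

   In the first path, AS 64504 learned the route from its provider and
   re-advertised it to another provider, which the rule flags.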

   For case (2), by manipulating BGP, an attacker can cause traffic to
   be rerouted in its favor, allowing it to intercept or modify the
   traffic.  If the malicious announcement is more specific than the
   legitimate one, or claims to offer a shorter path, the traffic may
   be directed to the attacker.  By broadcasting false announcements,
   the compromised router may poison the RIB of its peers.  After
   poisoning one peer, the malicious routing information could
   propagate to other peers, to other Autonomous Systems, and onto the
   Internet at large.  By monitoring BGP routes, ML algorithms can be
   trained to determine when a hijack has taken place so that
   necessary actions can be taken.
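
   Before any ML model is trained, the two signals named above (an
   unexpected origin and a more-specific prefix) can be checked with
   plain rules.  The sketch below is such a rule-based baseline, not an
   ML detector; the table of expected origins is hypothetical and
   would in practice be derived from route history or a registry.  Its
   labels could serve as input features for a trained model.

```python
import ipaddress

# Hypothetical baseline of expected origin ASes per prefix.
EXPECTED_ORIGINS = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,
    ipaddress.ip_network("198.51.100.0/24"): 64501,
}


def classify_announcement(prefix, origin_as):
    """Return 'ok', 'origin-mismatch', 'more-specific', or 'unknown'."""
    net = ipaddress.ip_network(prefix)
    for known, expected in EXPECTED_ORIGINS.items():
        if net.version != known.version:
            continue  # subnet_of() requires matching IP versions
        if net == known:
            return "ok" if origin_as == expected else "origin-mismatch"
        if net.subnet_of(known):
            # A longer prefix inside a known block wins longest-prefix
            # match, so an unexpected origin here is suspicious.
            return "ok" if origin_as == expected else "more-specific"
    return "unknown"


print(classify_announcement("192.0.2.0/24", 64500))    # ok
print(classify_announcement("192.0.2.0/24", 64510))    # origin-mismatch
print(classify_announcement("192.0.2.128/25", 64510))  # more-specific
print(classify_announcement("203.0.113.0/24", 64510))  # unknown
```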

   For case (3), with BMP providing BGP changes and telemetry providing
   network traffic information, the NAI applications can analyze
   traffic trends, predict traffic changes, and optimize traffic.

3.3.  Multilayer Fault Detection In NFV Framework

   The high reliability and high availability required for
   carrier-class applications are a big challenge in a virtualized,
   software-based environment, where failures are normal.  The
   interdependence between NFV's abstraction levels and virtual
   resources is complex, as shown in Figure 2.  The dynamic
   characteristics of the resources in the cloud environment make it
   difficult to locate a fault.  Multilayer fault detection for NFV
   networks and cloud environments is therefore very useful.


                         +--------------------+
                         |      Central       |
                         |     Controller     |
                         |        with        |
                         |         NTA        |
                         +--------------------+
                             |      |      |
                             |      |      |
                             |      |      |
                             V      V      V
        +-------------------------------------------------------+
        |                                                       |
        |      +-----------+  +-----------+  +-----------+      |
        |      |    VNF1   |  |    VNF2   |  |    VNF3   |      |
        |      +-----|-----+  +-----|-----+  +-----|-----+      |
        |            |              | VN-NF        |            |
        |    +-------|--------------|--------------|-------+    |
        |    | NFVI                                        |    |
        |    | +-----------+  +-----------+  +-----------+ |    |
        |    | |  Virtual  |  |  Virtual  |  |  Virtual  | |    |
        |    | | Computing |  |  Storage  |  |  Network  | |    |
        |    | +-----------+  +-----------+  +-----------+ |    |
        |    | +-----------------------------------------+ |    |
        |    | |        VIRTUALIZATION LAYER             | |    |
        |    | +--------------------|--------------------+ |    |
        |    |                VI-Ha |                      |    |
        |    |+---------------------|---------------------+|    |
        |    ||                        Hardware Resources ||    |
        |    ||+-----------+  +-----------+  +-----------+||    |
        |    ||| Computing |  |  Storage  |  |  Network  |||    |
        |    |||  Hardware |  |  Hardware |  |  Hardware |||    |
        |    ||+-----------+  +-----------+  +-----------+||    |
        |    |+-------------------------------------------+|    |
        |    +---------------------------------------------+    |
        |                                                       |
        +-------------------------------------------------------+

              Figure 2: NAI in Multi-layer NFV Framework

   For the virtualization layer, CPU performance, memory usage,
   interface bandwidth, and other Key Performance Indicators (KPIs)
   can be monitored.  At the same time, the resource occupancy and the
   life cycle of VNF software processes can also be monitored.  Through
   NAI, the relevant statistical data at multiple levels can be
   analyzed, and models can be built to locate the root cause of a
   possible fault in the multi-layer environment.
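
   A very simple form of multi-layer root cause localization walks the
   dependencies between layers: an anomalous component is a root cause
   candidate only if none of the components it directly depends on is
   also anomalous.  The dependency map and component names below are
   hypothetical, and the sketch assumes anomalies have already been
   detected per component.

```python
# Hypothetical dependency map: component -> components it depends on.
DEPENDS_ON = {
    "VNF1": ["vCompute1", "vNetwork1"],
    "VNF2": ["vCompute1", "vStorage1"],
    "vCompute1": ["server1"],
    "vStorage1": ["disk-array1"],
    "vNetwork1": ["nic1"],
}


def root_cause_candidates(anomalous):
    """Anomalous components with no anomalous direct dependency."""
    anomalous = set(anomalous)
    return sorted(
        comp for comp in anomalous
        if not any(dep in anomalous for dep in DEPENDS_ON.get(comp, []))
    )


print(root_cause_candidates({"VNF1", "VNF2", "vCompute1", "server1"}))
# ['server1']
print(root_cause_candidates({"VNF2", "vStorage1"}))
# ['vStorage1']
```

   In the first case the alarms on both VNFs and on the virtual compute
   resource are explained by the hardware fault beneath them.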





3.4.  Data Center Network Use Cases

   Traditionally, data center networks have comprised a large number
   of switches and routers that direct traffic based on the limited
   view of each device.  With the help of SDN/NFV, data center networks
   can adapt more quickly to changing usage and traffic patterns.
   Real-time traffic and usage data can be used to make data center
   management and operations intelligent.

   Various protocols such as sFlow and IPFIX can be used to obtain port
   statistics as well as traffic samples.  Over time, this information
   can help build traffic usage models on a per-port and per-flow
   basis.  With historical data as the base, the NTA engine can predict
   traffic usage and issue the necessary instructions to the SDN
   controller or NFV orchestrator.  For example, these instructions
   could be to reroute a flow to avoid a congested port, or to scale
   out by adding another switch to share load based on the predicted
   traffic demand.
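
   The following sketch shows one way per-port history could be turned
   into an instruction.  It assumes an exponentially weighted moving
   average as the predictor and an 80% high-water mark; both are
   illustrative choices, and the action names are hypothetical rather
   than messages of any SDN controller API.  A scale-out decision
   could follow the same pattern using aggregate utilization.

```python
def ewma_forecast(samples, alpha=0.5):
    """One-step-ahead forecast by exponentially weighted moving average."""
    forecast = samples[0]
    for value in samples[1:]:
        forecast = alpha * value + (1 - alpha) * forecast
    return forecast


def plan_actions(port_history, capacity, high_water=0.8):
    """Map each port to 'reroute' or 'none' from forecast utilization."""
    actions = {}
    for port, samples in port_history.items():
        utilization = ewma_forecast(samples) / capacity
        actions[port] = "reroute" if utilization > high_water else "none"
    return actions


# Hypothetical per-port rates in Gbit/s on 10 Gbit/s ports.
history = {"eth1": [4.0, 8.0, 9.0, 10.0], "eth2": [2.0, 2.0, 4.0, 2.0]}
print(plan_actions(history, capacity=10.0))
# {'eth1': 'reroute', 'eth2': 'none'}
```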

   The NTA engine should find correlations between the various network
   data to build models and predict the impact of network events,
   congestion, network utilization patterns, etc.  Further, the NTA
   engine could detect anomalies based on historical patterns and help
   in root cause analysis.  The policy framework can be enhanced to
   take the analytics into account.
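
   Anomaly detection against historical patterns can be as simple as a
   z-score test, sketched below.  The three-sigma limit and the sample
   values are assumptions for illustration; seasonal traffic would
   need a baseline per time-of-day or a more capable model.

```python
from statistics import mean, pstdev


def is_anomalous(history, value, z_limit=3.0):
    """True if value is more than z_limit sigmas from the history mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit


# Hypothetical flow rate samples (e.g. packets per second).
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_anomalous(baseline, 101))  # False
print(is_anomalous(baseline, 150))  # True
```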

   The NTA engine could also get usage and health information from the
   hosts (servers).  Correlating this information with the information
   received from the network could help in finding security flaws and
   anomalies when the two do not match.
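
   As a sketch of such a cross-check, the byte counts a host reports
   can be compared with what the attached switch port observed.  The
   counter names and the 5% tolerance are hypothetical; the point is
   only that a large disagreement between the two views is worth
   investigating.

```python
def mismatched_hosts(host_tx_bytes, network_rx_bytes, tolerance=0.05):
    """Hosts whose self-reported bytes differ from the network's view."""
    flagged = []
    for host, reported in host_tx_bytes.items():
        observed = network_rx_bytes.get(host, 0)
        reference = max(reported, observed)
        if reference and abs(reported - observed) / reference > tolerance:
            flagged.append(host)
    return sorted(flagged)


# Hypothetical counters for the same measurement interval.
host_view = {"srv1": 1_000_000, "srv2": 500_000, "srv3": 200_000}
network_view = {"srv1": 1_010_000, "srv2": 900_000, "srv3": 200_000}
print(mismatched_hosts(host_view, network_view))  # ['srv2']
```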

3.4.1.  Service Function Chaining

   This subsection introduces how to apply NAI to the Service Function
   Chaining (SFC) scenario in order to intelligently reroute or
   re-optimize service chains, increase the utilization of both
   Service Functions (SFs) and the network, and intelligently select
   the Service Function Path (SFP) based on data traffic trends.

   As per [RFC7665], Service Function Chaining (SFC) enables the
   creation of composite (network) services that consist of an ordered
   set of SFs that must be applied to packets and/or frames and/or
   flows selected as a result of classification.  The SFs of a chain
   are connected using a Service Function Forwarder (SFF), which is
   responsible for forwarding traffic to one or more connected SFs
   according to information carried in the SFC encapsulation, as well
   as handling traffic coming back from the SFs.

   Network telemetry information such as delay, jitter, and packet
   loss from the network, and CPU/memory utilization from the SFs, can
   be collected using sFlow or gRPC and stored in a persistent data
   repository.  The analytics component in the NTA engine can use the
   stored data to build statistical models that predict the impact on
   the various SFPs of network events, traffic, and the state of the
   SFPs, and can instruct the SDN controller to take necessary actions.
   The SDN controller can then calculate new paths or reroute the SFC
   path to avoid congested ports/SFFs or overloaded SFs.  This
   correlation of application analytics from the SFs and network
   analytics from the SFFs could enhance the intelligent management of
   service chains for operators.

   Usage and traffic patterns observed over time can help increase the
   utilization of SFs as well as of the underlay network.
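
   A minimal sketch of SFP selection from predicted SF load and link
   delay is given below.  The candidate paths, the 80% load limit, and
   the cost function (total delay, with overloaded SFs ruled out) are
   all assumptions made for illustration; [RFC7665] does not define a
   selection algorithm.

```python
def sfp_cost(path, sf_load, link_delay_ms, load_limit=0.8):
    """Total delay of an SFP, or infinity if any SF is overloaded."""
    if any(sf_load[sf] > load_limit for sf in path["sfs"]):
        return float("inf")
    return sum(link_delay_ms[link] for link in path["links"])


def select_sfp(candidates, sf_load, link_delay_ms):
    """Name of the lowest-cost candidate SFP, or None if none is usable."""
    costs = {name: sfp_cost(path, sf_load, link_delay_ms)
             for name, path in candidates.items()}
    best = min(costs, key=costs.get)
    return None if costs[best] == float("inf") else best


# Hypothetical candidates and predicted values from the analytics.
candidates = {
    "sfp-a": {"sfs": ["fw1", "nat1"], "links": ["l1", "l2"]},
    "sfp-b": {"sfs": ["fw2", "nat1"], "links": ["l3", "l2"]},
}
sf_load = {"fw1": 0.95, "fw2": 0.40, "nat1": 0.50}
link_delay_ms = {"l1": 2.0, "l2": 3.0, "l3": 4.0}

print(select_sfp(candidates, sf_load, link_delay_ms))  # sfp-b
```

   Although sfp-a has the lower delay, it is skipped because fw1 is
   predicted to be overloaded.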

4.  Contributors

   The following people have substantially contributed to the use cases
   of NAI:

   Lizhao You
   Huawei
   Email: youlizhao@huawei.com

   Kalyankumar Asangi
   Huawei
   Email: kalyana@huawei.com

5.  Security Considerations

   TBD

6.  IANA Considerations

   This document has no actions for IANA.

7.  Acknowledgement

   Thanks to Li Zhenbin and Liu Shucheng for their comments and
   contributions.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

8.2.  Informative References

   [I-D.li-rtgwg-network-ai-arch]
              Li, Z. and J. Zhang, "An Architecture of Network
              Artificial Intelligence(NAI)", draft-li-rtgwg-network-ai-
              arch-00 (work in progress), October 2016.

   [RFC4655]  Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
              Element (PCE)-Based Architecture", RFC 4655,
              DOI 10.17487/RFC4655, August 2006,
              <http://www.rfc-editor.org/info/rfc4655>.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015,
              <http://www.rfc-editor.org/info/rfc7665>.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016,
              <http://www.rfc-editor.org/info/rfc7752>.

   [RFC7854]  Scudder, J., Ed., Fernando, R., and S. Stuart, "BGP
              Monitoring Protocol (BMP)", RFC 7854,
              DOI 10.17487/RFC7854, June 2016,
              <http://www.rfc-editor.org/info/rfc7854>.

Authors' Addresses

   Yi Zheng
   China Unicom
   No.9, Shouti Nanlu, Haidian District
   Beijing  100048
   China

   Email: zhengyi39@chinaunicom.cn


   Xu Shiping
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing  100095
   P.R. China

   Email: xushiping7@huawei.com


   Dhruv Dhody
   Huawei Technologies
   Divyashree Techno Park, Whitefield
   Bangalore, Karnataka  560066
   India

   Email: dhruv.ietf@gmail.com