Minutes IETF100: nmrg

Meeting Minutes Network Management (nmrg) RG
Title Minutes IETF100: nmrg
State Active
Last updated 2017-12-05

Meeting Minutes

* Links
        - Agenda https://datatracker.ietf.org/doc/agenda-98-nmrg/
        - Materials https://datatracker.ietf.org/meeting/100/session/nmrg
        - Recordings http://ietf100.conf.meetecho.com/index.php/Recordings#NMRG
          and on Youtube https://www.youtube.com/user/ietf/videos

* Session 1
  Monday, November 13th, Afternoon session III, 17:40 - 18:40 (1 hour)

55 participants

        Scribe(s) and notetaker(s): Thank you Jéferson and Giovane!
        Agenda bashing

1.a) Intent Based Network Management (IBNM)
        . Distinguishing Intent, Policy, and Service Models (10 min.),
        presenter: Alexander Clemm
                MIC Toerless: no definition. scope it and be smaller. change
                wrt. existing document (RFC7575). Wait for the term to
                stabilize.
                MIC Sabine: build some bridges between what has been
                proposed some years ago in NFVRG (connectivity), focusing on
                how to design an intent grammar and by doing that specifying
                the scope of intent.
                        Alex: yes. link to prior work. intent grammar +/- out
                        of scope of this draft.
                MIC Toerless: not a try to define anything called intent.
                        Alex: not in the corpus of the document but references
                        to extensive literature

        . Concepts of Network Intent (10 min.), presenter: Mouli Chandramouli
                MIC Dean: congested link. well-known problem. what other
                problems are in the scope?
                        Mouli: intent is ambiguous. need to resolve the
                        ambiguity somehow. open problem.
                MIC Dean: was asking for other use cases
                MIC Diego: what is the goal of these documents?
                MIC Jabber/Mouli: one area of research topic that we
                want to pursue is network intent. only initial outline
                presented now. use cases were mentioned; actual customer use
                cases would be useful

        . Discussion and Next steps (15 min.)
                MIC Dean: network management work flows from operators, eg.
                peering, how they translate services into logical/physical
                topologies, replace a chassis router with physical devices in
                different topologies (CLOS), overlays, dimensioning
                MIC Expedia: define the intents clearly. NLP, AI, ... can be
                used but don't help without a clear problem definition.
                real case. going to the stage of writing in common language can
                make things worse. AI can help when intent is known (load
                balance, anomaly)
                MIC Toerless: fix terminology. get rid of the confusion on
                the semantics of intent.
                MIC Mouli: digital assistant (Alexa, Echo, Siri). Two persons
                ask the same question differently. Look from a networking
                perspective.
                MIC Diego: other term "slice". ok with the work on intent.
                becoming the interpretation of IRTF/IETF. important to
                build/define schema, information model. how intent, formal
                description
                MIC Dean: cart in front of the horse. first terminology. then
                use cases (by operators). business/technical SLA. get that
                from the network expressed in simple manner.

                Summary of main discussion points:
                        There are multiple definitions and domains using the
                        term intent. There are different approaches and means,
                        different levels/layers, different "producer/consumer".
                        What is the common denominator: intent as an
                        abstraction, abstracting the complexity of the
                        underlying layer / network.

                So what should the RG do?
                Lisandro summarizing the work plan as:
                        - How to call stuff: --> document(s) on
                          terminology/taxonomy
                        - How to express/model intent: information model,
                          grammar, languages
                        - How to realize Intent Based Network Management
                          System: "reference" functional architecture,
                          functional blocks/components, mechanisms/techniques,
                          theory of operations/lifecycle

                Additional work items could also investigate: functions and
                techniques; use cases; what are the challenges and
                corresponding research items.

1.b) NMRG in 2017, 2018 and beyond (15 min.)
        . 2017 RG meetings and progress of work
                + 4 meetings:
                        - IETF98, Chicago, March. Topic: Autonomic Networks 2.0
                        - IM2017, Lisbon, May. Topic: New research items
                        - IETF99, Prague, July. Topic: General meeting, and
                          Workshop on Measurement-Based Network Management
                        - IETF100, Singapore, November. Topic: Intent Based
                          Network Management, and Use of artificial
                          intelligence (AI) for network management

                + RG document
                        Autonomic Networking Use Case for Distributed Detection
                        of SLA Violations
                        Just passed IRSG poll. Next step: IESG conflict review.

                + 4 active (individual) documents
                + 1 document not published after IESG conflict review.

        . 2018 RG meetings plan
                *** Tentative list to be discussed ***
                + IETF 101, London, March
                + NOMS 2018, Taiwan, April
                + IETF 102, Montreal, July
                + IETF 103, TBA, November
                + Interim and topical meetings with other groups/events. Open
                for suggestions.

        . Evolution of NMRG
                + Research agenda organized around “themes”
                        - IBNM, AI techniques, Autonomics 2.0… Other topics of
                          interest and volunteers to progress the work?
                        - Derive research items, work plan and milestones

                MIC Giovane: collaborate with MAPRG

                + Do we need a new/revised charter?

                MIC Diego: ok with the way forward. don't think need to change
                the charter.
                MIC Alex: good ideas. some themes. not necessary to recharter.

* Session 2
  Tuesday, November 14th, Morning session I, 9:30 - 12:00 (2.5 hours)
  Special session on the Use of Artificial Intelligence (AI) for Network
  Management

57 participants

        Scribe(s) and notetaker(s): Thank you Jérôme and Giovane!
        Agenda bashing

2.a) Scope and Objectives (5 min.), presenters: NMRG chairs

2.b) Use cases and research results. (~70 min.)
        . A deep-reinforcement learning approach for software-defined
        networking routing optimization (10 min.), presenter: Albert Cabellos
                pros and cons of DRL
                lack of explainability -> develop Explainable Artificial
                Intelligence
                reward function == network management policy
                MIC Jabber/Stenio Fernandes: Have you tested your DRL model
                against adversarial agents?
                        Albert: No
                MIC Alex Clemm: How the reward function relates to policy
                definition ? How is this defined ?
                        Albert: The reward function is a continuous function.
                        The agent tries to increase the reward function. The
                        reward function should be the distance between the
                        current performance and the performance you want to
                        achieve
                MIC ?: Is it specific to SDN cases? Do you have vertical agents?
                        Albert: Only one agent on top of the SDN controller. No
                        vertical agent.
                MIC ?: What kind of routing algorithms are you using, is it
                deterministic ?
                        Albert: Yes, it is. The output cannot be "give me the
                        CLI configuration". The agent chooses the weights of
                        the links, as in OSPF, and the routers then choose the
                        paths with the lowest weight.
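
A minimal sketch of the two ideas in the answers above, under stated
assumptions: the reward is taken as the negative Euclidean distance between
measured and target performance (so that increasing the reward shrinks the
distance), and the routers are modelled as picking the lowest-weight path with
Dijkstra's algorithm over the link weights chosen by the agent. The function
names, the metric vector, and the three-node topology are hypothetical and are
not taken from the presentation.

```python
import heapq
import math


def reward(current, target):
    """Negative Euclidean distance between measured and target performance.

    Assumption: both arguments are equal-length sequences of metrics.
    A reward of 0 means the target performance is met exactly.
    """
    return -math.dist(current, target)


def lowest_weight_path(weights, src, dst):
    """Dijkstra over directed links given as {(u, v): weight}.

    Returns (total_weight, path); (inf, []) if dst is unreachable.
    """
    graph = {}
    for (u, v), w in weights.items():
        graph.setdefault(u, []).append((v, w))

    queue = [(0.0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + w, nxt, path + [nxt]))
    return math.inf, []


# Hypothetical action of the agent: one weight per link.
link_weights = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 5.0}
total, path = lowest_weight_path(link_weights, "A", "C")
```

With these weights the direct link A-C is avoided in favour of A-B-C; raising
the weight of A-B above 4 would move traffic back to the direct link, which is
the only lever the agent has in this sketch.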

        . Use of CVAE for Network QoS Management (10 min.), presenter: Shen Yan
                pros and cons of CVAE
                MIC Yansen: CVAE could help DRL for the training phase
                MIC Albert: You predict what will be the QoS given the current
                traffic matrix, like a simulator ?
                        Yansen: Yes.

        . Network traffic analysis for encrypted traffic and security
        monitoring (10 min.), presenter: Jérôme François
                MIC/Jabber Stenio Fernandes: question: in your dataset, how
                balanced (or imbalanced) is the target class?
                MIC Yansen: how relationship between ...?
                        Jerome: ?
                MIC/Jabber Xiaojin: slide 9
                        Jerome: classical features from SOTA. and Full features:
                        SOTA + new features on data, and Selected features
                MIC Dean: sample rates, applicable on-line? more for forensic
                        Jerome: near real-time (HTTPS firewall), once HTTPS
                        session is finished.
                MIC Olivier Marjoux: supervised, unsupervised, or both?
                        Jerome: signature, HTTPS is supervised. TDA is
                MIC Kireeti: how well you can classify?

        . Model-free Resource Management of Cloud-based applications using
        Reinforcement Learning (10 min.), presenter: Armen Aghasaryan
                MIC Dean: what topologies used to test the model?
                        Armen: simple web applications, parallel load balancers.
                        Dean: abnormal situations can be topology-specific. The
                        environment parameters may depend on topologies
                        Armen: avoid running the algorithm in abnormal
                        situations, use other approaches. You don't want to
                        learn which policies to apply when there is an
                        anomaly. You want to adapt to the normal situation
                MIC ?/Ericsson: slow convergence, and traffic changes faster
                than algorithm convergence. Can you give us a feeling for the
                convergence time? What happens if the traffic changes faster?
                        Armen: if input traffic has no stability patterns, then
                        no convergence. other approach: maintain Markovian
                        properties, then convergence can be achieved. No
                        guarantee about that. To converge, you should assume
                        that you'll reach stationarity. The assumption is
                        that the system behaves like a Markovian model.
                        Deviations have to be compliant with this Markovian
                        property. Otherwise, we cannot claim any convergence.

        . COGNET project: double closed-loop architecture, and dataset
        generation (10 min.), presenter: Diego Lopez
                MIC Jabber Stenio Fernandes: how big are these datasets?
                        Diego: in general, 3 days of traffic. 1-2 TB
                MIC Albert: Did you decide to learn the model offline ?
                        Diego: initial training is off-line, and then we
                        updated online.
                MIC: troubleshooting for VNFs should also be an application ?
                        Diego: Yes. It has been demonstrated by Nokia "noisy
                        neighbors", as partners of the CogNet project.

        . Q&A and discussion (20 min.)

2.c) Emerging landscape of AI in Networks (~50 min.)
        . IEEE ComSoc ETC on Network Intelligence (10 min.), presenter: Laurent
        Ciavaglia
        . ETSI ISG Experiential Network Intelligence (ENI) (10 min.),
        presenter: Will Liu
                MIC Yansen: frequency of operations (rate of data collection).
                MEF Open Issues answer regarding collection periods: align the
                data for different scope: it depends on the frequency of the
                        Will: If operation once per second: the higher
                        frequency data will be useless.
                MIC Yansen: ?

        . IETF/IRTF Intelligence Driven Networks (10 min.), presenter: Sheng
                MIC ?: There is a lot of research in ML, also in network
                management like VNF placement among others. This is an
                important topic for which we need data.
                MIC ?: need the data right, and control messages to the
                network.
                        Sheng: I would like to have more data in a
                        standardized format.
                MIC Lisandro: NMRG is a research group. Identifying what can
                be standardized or not is an important outcome. The
                standardization aspect is not the primary end goal.

        . Q&A and discussion (20 min.)

2.d) Discussion, conclusions and way forward (~20 minutes)
        . What have we learned? What’s important for the next steps?
                + Research items
                + Standardization path(s)
        . How do we structure the work to be done
                + Collective roadmap, work plan, deliverables

        Diego: importance of having data. An essential tool is to have data.
        The result depends on the data you feed the technology. Question
        about (availability, feasibility of) open dataset?
        Laurent: in other domains, there are reference databases (like in
        image or speech recognition). Can we imagine having a reference
        database for "networking"? Is it a viable approach? or should we
        rather seek the properties of the data we need: what are the specific
        properties these data should have?
        Diego: we have old databases, have new ones and keep them updated. the
        goal should be similar to other domains. The idea would be that we
        keep those data and keep them updated according to the evolution, but
        it is very challenging (data is an asset). having a set (even not
        necessarily complete) is important for reproducible research.
        Jeferson: based on our previous experience with Anima WG, a good push
        would be a use case meeting in order to set a draft (minimum set of
        use cases). This will give some material to push in the direction of
        AI and IDN.
        Sheng: use cases are important. some already collected in IDNET.
        compare use cases, gap analysis. identify from this future
        standardization work. Use cases are actually very important and we
        collect several ones in the ML. Need gap analysis to know if
        something is missing in these use cases.
        ?: data is important. methodologies are also important. no reference
        practice. Data is important but methodology as well. Traffic
        classification: different ways to classify, what insights are needed?
        Lisandro: to sum up. In the past, SNMP datasets showed SNMPv3 was not
        used as expected. Need an infrastructure for sharing, then willingness
        to share.
        Luyuan: focusing on the data is right. open config. real-time data
        streaming. how to handle the massive data. (in-memory processing?)
        We need real-time data. On top of that you can do your AI. How to
        handle the massive data? We need to scale it to reach acceptable
        processing time. Compared to other areas which used fixed, our data is
        more distributed.
        Jeferson: importance of use case with default structure to
        compare/document.
        Albert: data sets important. but also an asset. public dataset used
        for testing/benchmarking. not for training.
        Luyuan: commercial/production data will not be available.
        Diego: agree. also unsupervised methods in running networks seem not
        applicable in operational networks
        Luyuan: In our domain, AI is very distributed. One methodology will
        not solve all problems. You have to target one problem at a time.
        Stenio: It is impossible to have data; organize a competition similar
        to Kaggle