Networking Machine Learning Proposed Research Group
IETF 96, Berlin, Germany
Co-chairs: Albert Cabellos & Sheng Jiang
Monday Afternoon Session II, 15:40-17:40, Potsdam II

****************************************************************
1. Agenda Bash, by co-chairs

****************************************************************
2. Security monitoring in the Internet: the use case of phishing, by Jerome Francois

[David Meyer]: Q1: What is the assumption underlying your model? E.g., are you assuming the composition of URLs is changeable? Q2: Did you consider using statistical classification on categorical data?

[Jerome]: Q1: The main assumption is that the components of phishing URLs are not all related, at least not with the same relations as from a semantic point of view. Q2: We did not test that.

[Giovane]: Phisher registration is usually very cheap, so it does not necessarily point to the TLD. We have different datasets; maybe we can collaborate.

[Vijay Gurbani]: You got 99% accuracy by combining some of the classifiers; is that not the case here?

[Jerome]: No. We got 99% just by taking the value we have the most confidence in.

[Vijay Gurbani]: Did you entertain the idea of combining more classifiers?

[Jerome]: We have not investigated this yet.

[Vijay]: It might be interesting to look at C5.0, if you can combine it in any way. One last question: I was surprised to see the SVM result almost at the low end. Is that just a straight SVM, or did you tune it?

[Jerome]: First we analyzed one parameter, and then tuned it to be more promising...

[Sheng]: Q1: Is it possible to set up some autonomic follow-up, such as an autonomic filter based on the results? Q2: How "generic" is your solution? By "generic" I mean: if you change the network, how much would you need to change the solution?

[Jerome]: Q1: The origin of this work is a kind of plug-in that gives advice to users. We cannot really block, because the phishing-URL analysis produces false positives, and the analysis latency does not fit online users.
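As an illustration of the combination Jerome described (taking, per URL, the prediction of the single most confident classifier), a minimal sketch follows. The lexical features, toy corpus, and model choices are illustrative assumptions, not those used in the presented work.

    # Sketch: train several classifiers on lexical URL features and, for
    # each URL, take the prediction of the single most confident
    # classifier, as Jerome described. Features, corpus, and model
    # choices are toy assumptions, not those of the presented work.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    def lexical_features(url):
        # Hypothetical lexical features; real systems use many more.
        return [len(url), url.count('.'), url.count('-'),
                sum(c.isdigit() for c in url), int('@' in url)]

    # Toy labeled corpus (1 = phishing, 0 = benign), replicated so the
    # SVM's internal probability calibration has enough samples.
    urls = ["http://example.com/login", "http://news.example.org/a",
            "http://paypa1-secure.example.xyz/verify@acct",
            "http://203.0.113.9/upd-8/x"] * 10
    labels = [0, 0, 1, 1] * 10
    X = np.array([lexical_features(u) for u in urls])
    y = np.array(labels)

    clfs = [RandomForestClassifier(random_state=0).fit(X, y),
            LogisticRegression(max_iter=1000).fit(X, y),
            SVC(probability=True).fit(X, y)]  # a "straight", untuned SVM

    def most_confident_prediction(url):
        x = np.array([lexical_features(url)])
        probas = np.array([c.predict_proba(x)[0] for c in clfs])
        best = probas.max(axis=1).argmax()  # most confident classifier
        return int(probas[best].argmax())   # its predicted class

    print(most_confident_prediction("http://paypa1-login.example.top/@go"))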
****************************************************************
3. Applying Machine Learning Mechanisms to Network Traffic (draft-jiang-nmlrg-network-traffic-machine-learning), by Bing Liu

[Doug Montgomery]: Do you have any ambition for this draft to include some gap analysis? E.g., I am interested in flow export: does NetFlow provide enough information for ML? The draft could be more valuable if it outlined, e.g., requirements on other protocols to support ML.

[Bing]: The bottom line is to show the results of current research in the industry on applying ML to network traffic. But I agree it could be more valuable with some gap analysis.

[Doug]: That could be followed up in a dedicated draft.

[Sheng]: Yes. The current draft mainly reflects what we received at the last meeting. It is still at an early stage: just collecting the examples and some analysis of them. What you suggest definitely requires more input; otherwise we cannot do such a wide analysis.

[Doug]: Maybe it is a longer-term goal of this research group.

[Sheng]: Exactly. That is definitely in scope, and we have it in mind for the future.

[Lars Eggert]: The current "Security Considerations" section essentially says there is nothing to worry about. That is incorrect; there are certainly data privacy considerations.

[Sheng]: We made a mistake. We should have left it as "TBD" rather than saying there is nothing to worry about.

[Lars Eggert]: It's a -00 draft; "TBD" would be far better than saying there is nothing to worry about.

[Bing]: Got it.

[David Meyer]: 1. I second Lars's comment. There might be cases that poison the classifier; you might need to look into this. 2. Terminology issues: what the slides call "measurable" is usually "continuous"/"discontinuous" in ML, and the category below "measurable" is "categorical" in ML. 3. On "Architecture considerations 3/3": ensemble learning usually refers to something different; it can be applicable in a centralized or distributed learning environment as a kind of model averaging. 4. On "some models have limited accuracy": that is a really bold statement; we should not state accuracy, because we do not know it. It is speculation.

[Bing]: Both the terminology and the wording on accuracy need to be improved. Thanks.

[Albert]: 1. I second David's comment: we should not state "some models have limited accuracy". 2. I don't agree with the "control loop" section; ML is already useful for some mission-critical applications. 3. Maybe, in another draft, identify the features that all the use cases rely on and see whether there is common ground; then we could provide guidelines to other protocols on which features are most commonly useful.

[Bing]: I basically agree that, in the current state of the art, ML mostly assists human decisions; having the machine decide by itself is unreliable and even dangerous. But there may be use cases where the accuracy is sufficient for autonomic decisions.

[Albert]: We have several ??; those are mission-critical systems.

[Sheng]: I guess we don't agree with the statement "only for a small set of use cases". We don't know that; ML itself is still developing.

[Bing]: One potential area: from our study experience, we found that some network configurations do not actually need precise values. E.g., link costs in routing protocols only need to be distinguishable numbers rather than specific values. This kind of use case might be suitable for ML.

[Lars]: The slides list many unencrypted parameters. If we fast-forward, there may be more encrypted traffic in the future, so we will miss many of these features. In that case, can we still do something?

[Sheng]: 1. Sure. At the last meeting we had an HTTPS traffic analysis use case from Jerome. 2. And I second your earlier comment: besides a "Security Considerations" section, we might need a "Privacy Considerations" section, too.

[Vijay]: On slide 10, not to hammer this any further: as far as terminology is concerned, I have seen the ensemble-learning literature refer to multiple hypotheses for the same classifier. But here I think what you are trying to say is that you have different classifiers and have to somehow combine their results. If you do that in real time in a router, the latency of the different classifiers has to be taken into account; otherwise you are going to blow up the queue, and that is not good.

[Bing]: Ensemble learning in these slides specifically means possibly not different classifiers but the same classifier, where the dataset is partitioned and learned by multiple routers.

[Vijay]: Yes, ensembles combine different observations using either the same hypothesis or different hypotheses. But the thing to keep in mind is that whether you do the ensemble in the same router or pull parts from other routers impacts how you process packet decisions.
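A rough sketch of the setup Bing describes (the same classifier type trained independently on each router's local data, with predictions combined afterwards) may help; the model choice, data, and probability-averaging rule are illustrative assumptions, not taken from the draft.

    # Sketch of the setup described above: the same classifier type is
    # trained independently on each router's local data, and predictions
    # are combined by averaging class probabilities. Model, data, and
    # combination rule are illustrative assumptions, not from the draft.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)

    # Toy per-router datasets; in practice, locally observed flow
    # features with labels (e.g., 1 = anomalous, 0 = normal).
    router_data = [(rng.normal(size=(100, 4)), rng.integers(0, 2, size=100))
                   for _ in range(3)]

    # Each router trains its own instance of the same model on local data.
    models = [DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
              for X, y in router_data]

    def ensemble_predict(models, X):
        # Average class probabilities across routers, then pick the class.
        avg = np.mean([m.predict_proba(X) for m in models], axis=0)
        return avg.argmax(axis=1)

    print(ensemble_predict(models, rng.normal(size=(5, 4))))

If such a combination runs on a router's forwarding path, the slowest member model bounds the decision latency, which is Vijay's point about queue buildup.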
[David Meyer]: On the "control loop": in the near future we are going to learn that loop, rather than have the workflow we have today. Network control is basically business logic that somebody wrote. Learned control would be much more general than what we discussed here. I encourage everybody to look at AlphaGo and its policy network: the policy network is the control. We need to think about it more broadly in that way, too.

****************************************************************
4. Mobile network state characterization and prediction, by Voula Vassaki

This presentation was dropped due to bad remote audio quality.

****************************************************************
5. Open discussion on a potential standard dataset, led by Albert Cabellos

[Lars]: This is not a Working Group; we are not standardizing anything.

[Albert]: Potential...

[Lars]: Not potential; it is a Research Group.

[Sheng]: I guess what Albert means is whether we can standardize some dataset, not as an IETF standard.

[Lars]: You can say that you all try to agree on sharing a dataset for experimentation. But don't use the word "standard", especially when the slide says "Working Group" at the bottom.

[Brian Trammell]: The idea of having a public network traffic dataset is scary...

[Sheng]: We are actually not only talking about network traffic.

[Brian Trammell]: Yes, I know... One of the valuable things the Research Group could do is get access to the datasets that already exist, with a view to using them for ML. There are privacy implications of traffic datasets, and it is difficult for the Research Group to judge what a "good" dataset is. Guidelines on what types of datasets, algorithms, and problems are useful would help. If that is the charter and the statement of the RG, it could be useful.

[Vijay]: 1. What we have here is specific to network ML, like flow instances. That raises the question that the datasets companies might be willing to release are typically "good" data, but for supervised learning you also need a reasonable amount of "bad" data. 2. Some companies operating large networks maintain a lot of KPIs: CPU usage, numbers of packets dropped/queued. I have not seen any public dataset with that type of information; it could be correlated with the flow data as well.

[Albert]: Fully agree. I am not aware of any dataset with that kind of information.

[Vijay]: If you want to standardize something, you need to scope it. We also need to see which companies or other sources could provide different types of datasets.

[Elaine Newton]: When a technology is less mature, public datasets are good for researchers. As it matures, tests on a public dataset can be tuned to or gamed. So a sequestered dataset might be what you want to end up with.

[Albert]: For sequestered data, you are allowed to run tests with the data, but you are not allowed to see the data?

[Elaine]: Yes.

[Sheng]: You mentioned public datasets could be useful for an immature technology. So far we are not sure how mature ML for networking is. Maybe public data for the immature stage, and real data for the mature stage.
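To make the sequestered-dataset idea concrete, here is a minimal sketch of such an evaluation interface: the data holder runs a submitted model and returns only aggregate metrics, so the raw records never leave the holder. The class and interface are hypothetical illustrations, not an existing service.

    # Sketch of evaluation against a sequestered dataset: the data holder
    # runs a submitted model and returns only aggregate metrics, so raw
    # records never leave the holder. Hypothetical interface, not an
    # existing service.
    from typing import Callable, Sequence

    class SequesteredEvaluator:
        # Held by the data owner; exposes scores, never raw records.
        def __init__(self, features: Sequence, labels: Sequence):
            self._features = features  # private: never returned to callers
            self._labels = labels

        def evaluate(self, predict: Callable) -> dict:
            # Run the submitted model; return only aggregate accuracy.
            preds = [predict(x) for x in self._features]
            correct = sum(p == y for p, y in zip(preds, self._labels))
            return {"n": len(self._labels),
                    "accuracy": correct / len(self._labels)}

    # Researcher side: develop on public data, then submit the model.
    evaluator = SequesteredEvaluator(features=[(60, 2), (15, 0)],
                                     labels=[1, 0])
    print(evaluator.evaluate(lambda x: int(x[0] > 30)))  # accuracy: 1.0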
[David Meyer]: The incredible ML products would not have happened without datasets. We need open datasets, so that we do not each have to spend much time generating data. Image recognition has its datasets; for networking we do not have that, so every solution is one-off.

[Sheng]: Consider the diversity of tasks in the network ML area: we have many different tasks, maybe thousands, distinct from each other. As pioneers, we should first agree on the most important use cases; afterwards, we might be able to generalize some mechanisms.

[David Meyer]: For image ML, everything was ad hoc at the beginning, but now it is not that different. For networking, we are way behind that, but I believe we can get there.

[Doug]: I would suggest that jumping to datasets may be one step too far. You need to back up and think, for specific use cases or technologies, about how you want to test or benchmark them. The terabytes of data you download might not contain the data you need.

[Sheng]: Maybe the next step for the RG is to agree on a few use cases and on the existing datasets we can assess.

[Steven Wright]: I do not think a single dataset would be really useful at the current stage. We should start by figuring out what problems we are trying to solve; the use cases are really critical. A lot of ML is business logic, not networking in the traditional sense. A life-cycle view of dataset handling might be the most useful thing at the current stage.

[Brian Trammell]: (inaudible)

****************************************************************
6. Open discussion on the future direction of NMLRG, led by Sheng Jiang

[Vijay]: I suggest inviting researchers who collaborate with service providers at every IRTF meeting.

[David Meyer]: The ML community does not have sequestered data, and why not? It has a long tradition of openness: open data, open source, and generally open science.

[Sheng]: I encourage that. Maybe in the future we could set up an NMLRG Hackathon or something similar.

[Lee Howard]: If you have an algorithm/hypothesis that works great against cleaned-up public data, then run it against the sequestered dataset, you will get much more robust results.

[David Meyer]: I totally disagree that that is how algorithms get more robust. Openness is what brings robustness.

[Lee]: I work for an operator: I can let you test on the dataset, but I cannot release the data.

[Sheng]: Both you and David have a point. But the way forward is to work first on public open data.

[Lee]: That is exactly the suggestion: starting from public data, you get better and better algorithms, working with dirtier and dirtier data.

[Sheng]: And the algorithm should be open source from the very beginning. In that case, you can take it into your network, as can other providers.

[Lee]: I have no objection to open-sourcing algorithms as code, but that is not a requirement from me as an operator. Back to the co-location comment: operators know networks better; it would always be interesting for the IRTF to co-locate with operator groups.

[Steven Wright]: About the scope: where do you draw the dividing line between ML for networking and ML in a network context?

[Brian Carpenter]: That is a very ambitious program. Be cautious about launching too many things in parallel.

[Lars Eggert]: I will not decide on this RG right after this meeting; probably after the Seoul meeting, before I step down, which is March next year. I have seen a lot of interesting potential here; I also think there is still a lot of confusion. I liked the original slide of what the RG will do; the longer-term future work confuses me. It seems we could do anything.

Meeting adjourned, see you in Seoul.