Networking Machine Learning Proposed Research Group
IETF 96, Berlin, Germany
Co-chairs: Albert Cabellos & Sheng Jiang
Monday Afternoon Session II, 15:40-17:40, Potsdam II

****************************************************************
1. Agenda Bash, by co-chairs

****************************************************************
2. Security monitoring in the Internet: the use case of phishing, by Jerome Francois

[David Meyer]: Q1: What is the assumption underlying your model? E.g., are you assuming the composition of URLs is changeable? Q2: Did you consider using statistical classification on categorical data?

[Jerome]: Q1: The main assumption is that the components of phishing URLs are not all related, at least not with the same relations as from a semantic point of view. Q2: We did not test that.

[Giovane]: Phisher registration is usually very cheap, so it does not necessarily point to the TLD. We have different datasets; maybe we can collaborate.

[Vijay Gurbani]: You got 99% accuracy by combining some of the classifiers; is that not the case here?

[Jerome]: No. We got 99% just by taking the value we have the most confidence in.

[Vijay Gurbani]: Did you entertain the idea of combining more classifiers?

[Jerome]: We have not investigated this yet.

[Vijay]: It might be interesting to look at C5.0, if you can combine it in any way. One last question: I was surprised to see the SVM result almost at the low end. Is that just a straight SVM, or did you tune it?

[Jerome]: First we analyzed one parameter, and then tuned it to be more promising...

[Sheng]: Q1: Is it possible to set up some autonomic follow-up, such as an autonomic filter based on the results? Q2: How "generic" is your solution? By "generic" I mean: if you change the network, how much would you need to change the solution?

[Jerome]: Q1: The origin of this work is a kind of plug-in that gives advice to users. We cannot really block, because the phishing-URL analysis produces false positives, and the analysis latency does not fit online users.
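As an illustration of the combination Jerome described (taking, per URL, the prediction of the single most confident classifier), a minimal sketch follows. The lexical features, toy corpus, and model choices are illustrative assumptions, not those used in the presented work.

    # Sketch: train several classifiers on lexical URL features and, for
    # each URL, take the prediction of the single most confident
    # classifier, as Jerome described. Features, corpus, and model
    # choices are toy assumptions, not those of the presented work.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    def lexical_features(url):
        # Hypothetical lexical features; real systems use many more.
        return [len(url), url.count('.'), url.count('-'),
                sum(c.isdigit() for c in url), int('@' in url)]

    # Toy labeled corpus (1 = phishing, 0 = benign), replicated so the
    # SVM's internal probability calibration has enough samples.
    urls = ["http://example.com/login", "http://news.example.org/a",
            "http://paypa1-secure.example.xyz/verify@acct",
            "http://203.0.113.9/upd-8/x"] * 10
    labels = [0, 0, 1, 1] * 10
    X = np.array([lexical_features(u) for u in urls])
    y = np.array(labels)

    clfs = [RandomForestClassifier(random_state=0).fit(X, y),
            LogisticRegression(max_iter=1000).fit(X, y),
            SVC(probability=True).fit(X, y)]  # a "straight", untuned SVM

    def most_confident_prediction(url):
        x = np.array([lexical_features(url)])
        probas = np.array([c.predict_proba(x)[0] for c in clfs])
        best = probas.max(axis=1).argmax()  # most confident classifier
        return int(probas[best].argmax())   # its predicted class

    print(most_confident_prediction("http://paypa1-login.example.top/@go"))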
****************************************************************
3. Applying Machine Learning Mechanisms to Network Traffic (draft-jiang-nmlrg-network-traffic-machine-learning), by Bing Liu

[Doug Montgomery]: Do you have any ambition for this draft to include some gap analysis? E.g., I am interested in flow export: does NetFlow provide enough information for ML? The draft could be more valuable if it outlined, e.g., requirements on other protocols to support ML.

[Bing]: The bottom line is to show the results of current research in the industry on applying ML to network traffic. But I agree it could be more valuable with some gap analysis.

[Doug]: That could be followed up in a dedicated draft.

[Sheng]: Yes. The current draft mainly reflects what we received at the last meeting. It is still at an early stage: just collecting the examples and some analysis of them. What you suggest definitely requires more input; otherwise we cannot do such a wide analysis.

[Doug]: Maybe it is a longer-term goal of this research group.

[Sheng]: Exactly. That is definitely in scope, and we have it in mind for the future.

[Lars Eggert]: The current "Security Considerations" section essentially says there is nothing to worry about. That is incorrect; there are certainly data privacy considerations.

[Sheng]: We made a mistake. We should have left it as "TBD" rather than saying there is nothing to worry about.

[Lars Eggert]: It's a -00 draft; "TBD" would be far better than saying there is nothing to worry about.

[Bing]: Got it.

[David Meyer]: 1. I second Lars's comment. There might be cases that poison the classifier; you might need to look into this. 2. Terminology issues: what the slides call "measurable" is usually "continuous"/"discontinuous" in ML, and the category below "measurable" is "categorical" in ML. 3. On "Architecture considerations 3/3": ensemble learning usually refers to something different; it can be applicable in a centralized or distributed learning environment as a kind of model averaging. 4. On "some models have limited accuracy": that is a really bold statement; we should not state accuracy, because we do not know it. It is speculation.

[Bing]: Both the terminology and the wording on accuracy need to be improved. Thanks.

[Albert]: 1. I second David's comment: we should not state "some models have limited accuracy". 2. I don't agree with the "control loop" section; ML is already useful for some mission-critical applications. 3. Maybe, in another draft, identify the features that all the use cases rely on and see whether there is common ground; then we could provide guidelines to other protocols on which features are most commonly useful.

[Bing]: I basically agree that, in the current state of the art, ML mostly assists human decisions; having the machine decide by itself is unreliable and even dangerous. But there may be use cases where the accuracy is sufficient for autonomic decisions.

[Albert]: We have several ??; those are mission-critical systems.

[Sheng]: I guess we don't agree with the statement "only for a small set of use cases". We don't know that; ML itself is still developing.

[Bing]: One potential area: from our study experience, we found that some network configurations do not actually need precise values. E.g., link costs in routing protocols only need to be distinguishable numbers rather than specific values. This kind of use case might be suitable for ML.

[Lars]: The slides list many unencrypted parameters. If we fast-forward, there may be more encrypted traffic in the future, so we will miss many of these features. In that case, can we still do something?

[Sheng]: 1. Sure. At the last meeting we had an HTTPS traffic analysis use case from Jerome. 2. And I second your earlier comment: besides a "Security Considerations" section, we might need a "Privacy Considerations" section, too.

[Vijay]: On slide 10, not to hammer this any further: as far as terminology is concerned, I have seen the ensemble-learning literature refer to multiple hypotheses for the same classifier. But here I think what you are trying to say is that you have different classifiers and have to somehow combine their results. If you do that in real time in a router, the latency of the different classifiers has to be taken into account; otherwise you are going to blow up the queue, and that is not good.

[Bing]: Ensemble learning in these slides specifically means possibly not different classifiers but the same classifier, where the dataset is partitioned and learned by multiple routers.

[Vijay]: Yes, ensembles combine different observations using either the same hypothesis or different hypotheses. But the thing to keep in mind is that whether you do the ensemble in the same router or pull parts from other routers impacts how you process packet decisions.
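A rough sketch of the setup Bing describes (the same classifier type trained independently on each router's local data, with predictions combined afterwards) may help; the model choice, data, and probability-averaging rule are illustrative assumptions, not taken from the draft.

    # Sketch of the setup described above: the same classifier type is
    # trained independently on each router's local data, and predictions
    # are combined by averaging class probabilities. Model, data, and
    # combination rule are illustrative assumptions, not from the draft.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)

    # Toy per-router datasets; in practice, locally observed flow
    # features with labels (e.g., 1 = anomalous, 0 = normal).
    router_data = [(rng.normal(size=(100, 4)), rng.integers(0, 2, size=100))
                   for _ in range(3)]

    # Each router trains its own instance of the same model on local data.
    models = [DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
              for X, y in router_data]

    def ensemble_predict(models, X):
        # Average class probabilities across routers, then pick the class.
        avg = np.mean([m.predict_proba(X) for m in models], axis=0)
        return avg.argmax(axis=1)

    print(ensemble_predict(models, rng.normal(size=(5, 4))))

If such a combination runs on a router's forwarding path, the slowest member model bounds the decision latency, which is Vijay's point about queue buildup.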
[David Meyer]: On the "control loop": in the near future we are going to learn that loop, rather than have the workflow we have today. Network control is basically business logic that somebody wrote. Learned control would be much more general than what we discussed here. I encourage everybody to look at AlphaGo and its policy network: the policy network is the control. We need to think about it more broadly in that way, too.

****************************************************************
4. Mobile network state characterization and prediction, by Voula Vassaki

This presentation was dropped due to bad remote audio quality.

****************************************************************
5. Open discussion on a potential standard dataset, led by Albert Cabellos

[Lars]: This is not a Working Group; we are not standardizing anything.

[Albert]: Potential...

[Lars]: Not potential; it is a Research Group.

[Sheng]: I guess what Albert means is whether we can standardize some dataset, not as an IETF standard.

[Lars]: You can say that you all try to agree on sharing a dataset for experimentation. But don't use the word "standard", especially when the slide says "Working Group" at the bottom.

[Brian Trammell]: The idea of having a public network traffic dataset is scary...

[Sheng]: We are actually not only talking about network traffic.

[Brian Trammell]: Yes, I know... One of the valuable things the Research Group could do is get access to the datasets that already exist, with a view to using them for ML. There are privacy implications of traffic datasets, and it is difficult for the Research Group to judge what a "good" dataset is. Guidelines on what types of datasets, algorithms, and problems are useful would help. If that is the charter and the statement of the RG, it could be useful.

[Vijay]: 1. What we have here is specific to network ML, like flow instances. That raises the question that the datasets companies might be willing to release are typically "good" data, but for supervised learning you also need a reasonable amount of "bad" data. 2. Some companies operating large networks maintain a lot of KPIs: CPU usage, numbers of packets dropped/queued. I have not seen any public dataset with that type of information; it could be correlated with the flow data as well.

[Albert]: Fully agree. I am not aware of any dataset with that kind of information.

[Vijay]: If you want to standardize something, you need to scope it. We also need to see which companies or other sources could provide different types of datasets.

[Elaine Newton]: When a technology is less mature, public datasets are good for researchers. As it matures, tests on a public dataset can be tuned to or gamed. So a sequestered dataset might be what you want to end up with.

[Albert]: For sequestered data, you are allowed to run tests with the data, but you are not allowed to see the data?

[Elaine]: Yes.

[Sheng]: You mentioned public datasets could be useful for an immature technology. So far we are not sure how mature ML for networking is. Maybe public data for the immature stage, and real data for the mature stage.
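To make the sequestered-dataset idea concrete, here is a minimal sketch of such an evaluation interface: the data holder runs a submitted model and returns only aggregate metrics, so the raw records never leave the holder. The class and interface are hypothetical illustrations, not an existing service.

    # Sketch of evaluation against a sequestered dataset: the data holder
    # runs a submitted model and returns only aggregate metrics, so raw
    # records never leave the holder. Hypothetical interface, not an
    # existing service.
    from typing import Callable, Sequence

    class SequesteredEvaluator:
        # Held by the data owner; exposes scores, never raw records.
        def __init__(self, features: Sequence, labels: Sequence):
            self._features = features  # private: never returned to callers
            self._labels = labels

        def evaluate(self, predict: Callable) -> dict:
            # Run the submitted model; return only aggregate accuracy.
            preds = [predict(x) for x in self._features]
            correct = sum(p == y for p, y in zip(preds, self._labels))
            return {"n": len(self._labels),
                    "accuracy": correct / len(self._labels)}

    # Researcher side: develop on public data, then submit the model.
    evaluator = SequesteredEvaluator(features=[(60, 2), (15, 0)],
                                     labels=[1, 0])
    print(evaluator.evaluate(lambda x: int(x[0] > 30)))  # accuracy: 1.0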
[David Meyer]: The incredible ML products would not have happened without datasets. We need open datasets, so that we do not each have to spend much time generating data. Image recognition has its datasets; for networking we do not have that, so every solution is one-off.

[Sheng]: Consider the diversity of tasks in the network ML area: we have many different tasks, maybe thousands, distinct from each other. As pioneers, we should first agree on the most important use cases; afterwards, we might be able to generalize some mechanisms.

[David Meyer]: For image ML, everything was ad hoc at the beginning, but now it is not that different. For networking, we are way behind that, but I believe we can get there.

[Doug]: I would suggest that jumping to datasets may be one step too far. You need to back up and think, for specific use cases or technologies, about how you want to test or benchmark them. The terabytes of data you download might not contain the data you need.

[Sheng]: Maybe the next step for the RG is to agree on a few use cases and on the existing datasets we can assess.

[Steven Wright]: I do not think a single dataset would be really useful at the current stage. We should start by figuring out what problems we are trying to solve; the use cases are really critical. A lot of ML is business logic, not networking in the traditional sense. A life-cycle view of dataset handling might be the most useful thing at the current stage.

[Brian Trammell]: (inaudible)

****************************************************************
6. Open discussion on the future direction of NMLRG, led by Sheng Jiang

[Vijay]: I suggest inviting researchers who collaborate with service providers at every IRTF meeting.

[David Meyer]: The ML community does not have sequestered data, and why not? It has a long tradition of openness: open data, open source, and generally open science.

[Sheng]: I encourage that. Maybe in the future we could set up an NMLRG Hackathon or something similar.

[Lee Howard]: If you have an algorithm/hypothesis that works great against cleaned-up public data, then run it against the sequestered dataset, you will get much more robust results.

[David Meyer]: I totally disagree that that is how algorithms get more robust. Openness is what brings robustness.

[Lee]: I work for an operator: I can let you test on the dataset, but I cannot release the data.

[Sheng]: Both you and David have a point. But the way forward is to work first on public open data.

[Lee]: That is exactly the suggestion: starting from public data, you get better and better algorithms, working with dirtier and dirtier data.

[Sheng]: And the algorithm should be open source from the very beginning. In that case, you can take it into your network, as can other providers.

[Lee]: I have no objection to open-sourcing algorithms as code, but that is not a requirement from me as an operator. Back to the co-location comment: operators know networks better; it would always be interesting for the IRTF to co-locate with operator groups.

[Steven Wright]: About the scope: where do you draw the dividing line between ML for networking and ML in a network context?

[Brian Carpenter]: That is a very ambitious program. Be cautious about launching too many things in parallel.

[Lars Eggert]: I will not decide on this RG right after this meeting; probably after the Seoul meeting, before I step down, which is March next year. I have seen a lot of interesting potential here; I also think there is still a lot of confusion. I liked the original slide of what the RG will do; the longer-term future work confuses me. It seems we could do anything.

Meeting adjourned, see you in Seoul.