# IRTF Open Meeting

Monday, 27 March 2023, at 13:00-15:00 Japan Standard Time

Room: G401-G402

Chair: Colin Perkins
Minutes: Colin Perkins
Recording: [YouTube](https://www.youtube.com/watch?v=qHT-jlc1rTI)

## Introduction and Status Update

Speaker: Colin Perkins, IRTF Chair

[Slides](https://datatracker.ietf.org/meeting/116/materials/slides-116-irtfopen-agenda-01)

## Introducing the Usable Formal Methods Proposed Research Group

Speaker: Jonathan Hoyland

[Slides](https://datatracker.ietf.org/meeting/116/materials/slides-116-irtfopen-introducing-the-usable-formal-methods-research-group-00)

Jonathan Hoyland gave an introduction to the new Usable Formal Methods Proposed Research Group (UFMRG). This group is studying the use of mathematical techniques to assist in the specification, design, analysis, and implementation of network protocols. That is, it's asking whether we can bring mathematical rigour to our work with protocols.

Jonathan gave a very brief history of the use of formal methods for protocols. He summarised the difference between formal analysis and formal verification of protocols, with the former broadly focussing on design correctness and the latter on implementation correctness, and noted that the research group will consider both aspects of formal methods. He reviewed the way the academic formal methods research community engaged with the design of TLS 1.3 and found, and removed, several flaws in early versions of its design. This was effective, but the correctness proofs are hard to produce, hard to understand, hard to verify, and hard to adjust. For academics, formal analysis of protocols is also high risk and low reward: results that find new types of problem are publishable, but those that demonstrate correctness of a protocol are hard to publish.

The UFMRG is aiming to provide a place for experts to gather. It's aiming to accumulate a pool of knowledge and training materials around formal methods applied to protocols, and to provide feedback to designers of formal specification tools. It also hopes to provide a venue to publish proofs of correctness, possibly via the Applied Networking Research Workshop. A non-goal is to change the IETF processes. This is a research group that hopes to encourage the use of formal methods, not an attempt to force their use. Success will come if people see the work in the group and want to use formal methods in their protocol design.

Questions:

* Rod van Meter, Keio University: Proofs depend on a certain set of assumptions; is considering these in scope? Yes, there are some standard assumptions used, for example that a cryptographic algorithm is secure, that need to be verified separately.
* ???, Telefonica: Is the focus solely on security properties? No – that's Jonathan's background, but it's not the sole focus of the RG.

## Introducing the Research and Analysis of Standard-Setting Processes Proposed Research Group

Speaker: Ignacio Castro

[Slides](https://datatracker.ietf.org/meeting/116/materials/slides-116-irtfopen-introducing-the-research-and-analysis-of-standard-setting-processes-proposed-research-group-00)

Ignacio Castro introduced the new Research and Analysis of Standard-Setting Processes Proposed Research Group (RASPRG). This group is about understanding how the standards process works in the IETF, with the goal of producing data and understanding to inform the IETF and the broader research community.
For example, the group might analyse the email archives to help understand the consensus formation process, it might develop tools to make recommendations for draft reviewers, or it might study demographic shifts in the community that develops standards. The group will take a collaborative approach between practitioners, developers, policy makers, and researchers, and will provide a repository of data and tooling. It's not aiming to make hierarchical comparisons between different standards-developing organisations or to change operational practice in the IETF.

## Autonomous NIC Offloads

Speaker: Boris Pismenny

[Paper](http://www.cs.tau.ac.il/~mad/publications/asplos2021-offload.pdf)
[Slides](https://datatracker.ietf.org/meeting/116/materials/slides-116-irtfopen-autonomous-nic-offloads-00)

Discussion:

* Colin Perkins: The approach assumes certain properties of a protocol to be offloaded: it must be size preserving, incrementally computable, have a constant-size state, and so on. These obviously fit some protocols, but not all. To what extent are these fundamental constraints vs. limitations of the current work that might be removed in future?
* Boris: They're almost all fundamental. The only one that could be softened is the one around block sizes for AES-CBC, but since AES-CBC is deprecated that's not a priority.
* Dave Oran: Do the trade-offs change if you have a user-mode TCP stack? No.
* Colin Perkins: Are there any guidance or issues that the IETF should consider when designing protocols, to make this type of approach to offloading work better?
* Boris: This approach also works for TLS 1.3, but one of the issues there was that it's not possible to know if a record is application data or something else until the entire record has been decrypted. This introduces complexity in dealing with application data vs. handshake packets that could be avoided if different types of data could be distinguished after partial decryption.
* Rod van Meter: To what extent is the value of this work dependent on the particular state of the technology? Will the same ideas apply in future as hardware advances, or do they depend on particulars of current systems? If you could change one parameter in the system, e.g., double the bandwidth or add more cores, what would improve performance?
* Boris: We're in the early days of this technology, so it's hard to predict. Making the most out of this requires a lot of software changes. The trend is that NICs are getting faster but CPUs are not, which is pushing for offloading. As long as we're using TCP, he expects the technology to remain relevant.

## AI/ML for Network Security: The Emperor has no Clothes

Speaker: Arthur Selle Jacobs

[Paper](http://www.inf.ufrgs.br/~granville/wp-content/papercite-data/pdf/ccsjacobs2022.pdf)
[Slides](https://datatracker.ietf.org/meeting/116/materials/slides-116-irtfopen-aiml-for-network-security-the-emperor-has-no-clothes-00)

Discussion:

* Stephen Farrell: This depends on having access to the training data. Can it be extended to not need such access? Yes – you don't need the exact training data, although the insights are better if you have the data and can develop explanations.
* Stephen Farrell: Are there any recommendations that can be made to validate models? This is hard to automate unless you're familiar with the domain, but the paper does have some guidelines.
* Doug Montgomery: Is the resulting decision tree model unique or optimal in some way? It's not unique and probably not optimal.
It optimises for fidelity, and there could be multiple decision trees that have the same fidelity (a minimal sketch of this fidelity measure follows the discussion). The expressivity of the decision trees is lower than that of a neural network, and different decision tree explanations for the same neural network might all be true and reveal different aspects of the system.
* Doug Montgomery: Are you retraining on the data? No, they're evaluating on the original data to see how the model will perform. The techniques treat the models as black boxes, so there's no need to retrain the models on the synthetic data.
* Brian Trammell: The analysis is based on fairly deep domain analysis. Is it accurate to characterise the work as automation-assisted analysis of decision trees? Yes.
* Brian Trammell: Can you generate or extract synthetic network data using the model to automate the analysis? Yes, there was a HotNets 2021 paper doing something similar.
* Priyanka Sinha: The start of the paper is about explaining the models, but the later parts comment on the datasets. She's familiar with the "data excellence" work and with datasheets for datasets, and wonders if this has relevance? Most of the problems have arisen from bad data, but choosing which features to use from the data is an important part of the model; it's an important topic. Also, have you considered an information gain measure? No, but that's a good point.
* Diego Lopez: We have been trying to set up mechanisms for publishing usable datasets for training ML models. Would it be advisable for us to try to make public, high-quality data available to train the models? Yes, there's a lot of private data in networking and less sharing than in other domains. Making more data available would be valuable. Privacy issues are a concern, and they were not able to release all their data as a result.
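To illustrate the fidelity measure referred to in the discussion, the sketch below fits a surrogate decision tree to a black-box classifier's predictions and reports how often the two agree on held-out data. This is a minimal illustration of the general idea, not the paper's actual pipeline: the dataset is synthetic and the random forest is a stand-in for whatever black-box model is being explained.

```python
# Minimal sketch (not the paper's tooling): explain a black-box model by
# training a decision tree to mimic its predictions, then measure fidelity.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier  # stand-in black box
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for flow features with attack/benign labels.
X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The "opaque" model under study; only its predictions are used below.
black_box = RandomForestClassifier(n_estimators=100, random_state=0)
black_box.fit(X_train, y_train)

# Train the surrogate on the black box's outputs, not the true labels,
# so the tree explains the model rather than the underlying task.
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: agreement between surrogate and black box on held-out data.
fidelity = np.mean(surrogate.predict(X_test) == black_box.predict(X_test))
print(f"fidelity: {fidelity:.3f}")
print(export_text(surrogate))
```

As the discussion notes, several different trees can reach the same fidelity while highlighting different aspects of the model, so a high fidelity score alone does not make an explanation unique or optimal.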