# IETF 110 Computing-in-the-Network (COIN) RG

COIN RG: https://datatracker.ietf.org/rg/coinrg/about/

COIN RG documents: https://datatracker.ietf.org/rg/coinrg/documents/

Meeting materials: https://datatracker.ietf.org/rg/coinrg/meetings/

These notes: https://codimd.ietf.org/notes-ietf-110-coinrg

Notetaker: Eve Schooler

Chairs update, reminder of IPR policies, today's agenda is research focused.

# 1. Research Presentations (15 mins each)

## a) FlowLens - Diogo Barradas (U. Lisbon)

Joint work with Nuno Santos, Luís Rodrigues, Salvatore Signorello, Fernando M. V. Ramos and André Madeira

Paper from NDSS'21: https://www.ndss-symposium.org/ndss-paper/flowlens-enabling-efficient-flow-classification-for-ml-based-network-security-applications/

Research questions: Can we collect packet distributions within programmable switches? Can we do so efficiently and generically? The starting point is that obtaining packet distributions in programmable switches at scale has not seemed feasible.

Contributions: FlowLens, a flow classification system for generic ML-based security tasks

- Flow markers: compact representation of packet distributions in programmable switches
- Flow marker accumulator: implementation of flow marker collection in switching hardware
- Automatic profiling: app-tailored configuration of flow markers
- Evaluation: tested in 3 different ML-based security tasks: covert channel detection, website fingerprinting and botnet detection.

What does it take to compress packet distributions efficiently? Produce flow markers with two operators: quantization and truncation. Up to 150x size reduction.

How are flow markers collected in the switch? Feed-forward pipeline. Truncation requires only an additional pipeline stage.

Leverage Bayesian optimization to reduce the large configuration space of quantization x truncation. Saves many hours of testing sub-optimal configs.

FlowLens scales the amount of inspected flows and retains accuracy. [See slides for full details on detection and accuracy measurements.]
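
As a rough illustration of the two flow-marker operators mentioned above, the sketch below builds a compact marker from a flow's packet lengths. It is not taken from the FlowLens artifacts: the function names, the shift-based binning, and the `kept_bins` list are assumptions made for this example, and the sample packet lengths are made up.

```python
from collections import Counter


def quantize(packet_lengths, ql):
    """Quantization (assumed form): drop the ql low-order bits of each
    packet length so that 2**ql neighbouring lengths share one bin."""
    return Counter(length >> ql for length in packet_lengths)


def truncate(histogram, kept_bins):
    """Truncation (assumed form): keep only the counters for the bins in
    kept_bins, in that order. Bins never seen in the flow count as 0."""
    return [histogram.get(b, 0) for b in kept_bins]


def flow_marker(packet_lengths, ql, kept_bins):
    """Hypothetical helper combining both operators into one marker."""
    return truncate(quantize(packet_lengths, ql), kept_bins)


# Made-up packet lengths (bytes) for a single flow.
lengths = [60, 64, 70, 1500, 1480, 590, 600]

# With ql=6 each bin spans 64 bytes; only four bins are retained.
marker = flow_marker(lengths, ql=6, kept_bins=[0, 1, 9, 23])
print(marker)
```

In this sketch the quantization level and the retained bins are the two knobs; per the talk, choosing such a configuration per application is what the automatic profiling step does.
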
The experimentation artifacts are publicly available. [See the code in the slides, as well as very interesting Discussion questions]

## b) Forwarding and Routing with Packet Subscriptions - Theo Jepsen (Stanford)

CoNEXT'20 paper: http://www.cs.yale.edu/homes/soule/pubs/conext2020-jepsen.pdf

We now have fast, programmable networks. We can use them for more expressive routing, by using packet subscriptions.

Challenges: How to make them practical? How to evaluate rules? How to route rules? How to populate tables efficiently? How to make them compact?

Compile subscription rules and convert them into a BDD (binary decision diagram). Each node in the BDD is a predicate of the rule. Key contribution: How to execute the rules on a switch. The output of the rules is state; state is passed from table to table.

The talk presents 2 schemes for routing these packet subscriptions: a traffic reduction scheme and a memory reduction scheme. See the paper for another scheme relying on approximation, and the tradeoffs.

Questions:

- Are packet subscriptions useful to applications (and do they supply the interfaces they need)? Looked at apps: Nasdaq (market feed filtering), hICN (video streaming) and in-band network telemetry.
- Do these subscriptions provide performance benefits? In-network filtering reduces tail latency.
- Is memory used efficiently (FIBs)? Overall the compiler uses memory efficiently. No difference between the 2 schemes.

Conclusion: Packet subscriptions provide the network abstraction used by apps, improve performance by using network resources efficiently, and scale to large network topologies. Try it out! [See slides for pointers]

Q: Alexander Clemm. How do you deal with encrypted traffic? Can specify to the compiler the header format in P4. Some limitations, i.e., how deep it can look into the header. Subscriptions are currently in plain text. Looking at homomorphic encryption.

Q: Dirk Kutscher. What about congestion control? Would argue it is orthogonal, but could add CC to packet subscription.
Another related issue is reliability. It is difficult to handle in subscription systems, which have 1-N and N-N models. So, CC should probably be handled at a different layer than packet subscriptions.

Q: Tianji Jiang. For the example used with Nasdaq, see the following paper: "Too Fast or Too Slow? Determining the Optimal Speed of Financial Markets", Austin Gerig (US Securities and Exchange Commission), co-author: Daniel Fricke.

[From the chat]

Q: David Oran: Apologize since I didn't read the paper - but do you reference the old DIFANE work that trades off memory for forwarding tables against stretch? That way you don't need full rule sets in every switch?

Theo: 1) I already addressed this question. 2) For fragmentation, packet subscriptions supports fragmentation at the application level. This requires the application placing the header fields in all fragments. Since packet subscriptions does not track IP frag sequence numbers, it can't reassemble a stream at that level.

David Oran: so the application needs to be changed to use this, which I guess is fine. This could also help with the crypto question, as you might check the ICN work on name privacy through obfuscation which can hide the actual matching values without crypto, and get some privacy (but not defend against linkability).

Theo Jepsen: thank you for bringing up DIFANE. I am aware of that work. We do something similar with our approximation routing scheme that we discuss at the end of section 4 in our paper: http://www.cs.yale.edu/homes/soule/pubs/conext2020-jepsen.pdf

## c) FogStore: Toward a Distributed Data Store for Fog Computing - Harshit Gupta (Georgia Tech)

https://arxiv.org/pdf/1709.07558.pdf

Data management over geo-distributed edge computing infrastructure.

Vision: a continuum of resources, could be owned by multiple providers, dense geo-distribution, limited capacity at each site.
Disruptive to data management because: situation-awareness apps generate large amounts of data, and data exchange for coordination between connected entities presents challenges.

Challenges of edge computing infrastructures: highly heterogeneous, dense geo-distribution (as compared with DCs, where granularity at the level of racks is sufficient), no support to monitor client mobility (workload dynamism).

This talk focuses on 2 systems, covering two classes of data management.

1) DataFog: geo-distributed spatio-temporal K-V store. Targets situation-awareness apps, requiring low-latency access to data. Extended Apache Cassandra and ran emulations for performance evaluation with and without the extensions. Synchronous replication for consistency. Many latency and fault tolerance tradeoffs. E.g., because of the tradeoff between latency and fault tolerance, developed two types of replicas: those in-vicinity and strongly consistent, and those that are remote and eventually consistent.

2) ePulsar: Topic-based pub-sub system. Geo-distributed pub-sub system, similar interface as Kafka and Pulsar. The source of latency in pub-sub systems is often a result of being topology unaware. Two main contributions: incorporated a network coordinates protocol to estimate client-broker latencies, and incorporated the coordinates into broker selection, which adapts to client mobility.

[See references in the slides.]

Q: Jeffrey He: Which is the major latency? Synchronization or replication? It is tied to the selection of the replicas and where they are located. So it is actually a combination of both.

Q: Dirk Kutscher: in general, on a good track and good potential re how the relevant app-layer system can be better supported by the network. In your system, we are often interested in the performance of the network topology and not so much the geo-location or geographic coordinates. What is your view? One reason why we moved toward a network coordinate approach: it is agnostic of location and expresses proximity in terms of RTT.
As an edge provider, one wouldn't know about the innards of the network, e.g., a micro-edge deployed on peering sites.

[From the chat]

Q: Stuart Card: How about latitude, longitude and altitude?

David Oran: Let's replace hop counts with cartesian distance :-) and stop loops by never getting closer to the source (c.f. https://en.wikipedia.org/wiki/Geographic_routing).

Q: Jianfei(Jeffrey) HE: Geographical routing?

Stuart Card: There is a spotty history of attempts at "geocast"

Jianfei(Jeffrey) HE: routing by computing without tables

David Oran: Geo is likely useful for filtering, but not for routing

Dirk Kutscher: yes

Stuart Card: Also routing if it is a homogeneous wireless mesh, otherwise not so much.

Harshit Gupta (the presenter): We envision that for a heterogeneous edge, location is not the best method of selecting replicas. A more topology-agnostic approach like Network Coordinates can help. Network Coords would already take into account the routing between nodes. Do you consider edge-to-edge communication without going to the core?

Jianfei(Jeffrey) HE: @Harshit, in Mobile Edge Computing (MEC) context, you can assume edge can communicate with edges without going up to the core.

Philip Eardley: Interesting, thanks! the part with finding the marbles up the tree reminded me of some old manet routing work (mobile adhoc networks) and creating Directed acyclic graph

Harshit Gupta: @Jeffrey, is it possible to point me to some document that goes into little more detail about inter-edge communication? I have been having a hard time to get some meaningful network characteristics (topology, RTT) between edge sites.

Jianfei(Jeffrey) HE: @Harshit, that is formally in the 3GPP documents.

Tianji Jiang: @Harshit @Jeffrey: it is 3GPP TR. But, the version in 3GPP right now is about to be updated in 1 week (https://www.3gpp.org/ftp/Specs/archive/23_series/23.748/)

## d) Hierarchical Data Storage and Processing on the Edge of the Network - Seyed Hossein Mortazavi (U. Toronto)

with Eyal de Lara, University of Toronto

https://usenix.org/system/files/conference/hotedge18/hotedge18-papers-mortazavi.pdf

https://cse.buffalo.edu/faculty/tkosar/cse710_spring19/mortavazi-sec17.pdf

Next-gen apps require lower latencies, better response time, high bandwidth, as well as privacy, exact location detection and scalability. This has led to edge computing, and the notion of a hierarchical DC infrastructure.

The goal is to emulate the success of web (CRUD) apps on the cloud, where apps are partitioned into independent stateless handlers; this model was enabled by a shared storage layer.

CloudPath - a platform that enables the execution of 3rd party apps, advocating a separation between app code and data. Developers: organize apps as a collection of stateless functions. CloudPath: on-demand replication of code and data. Provides a common runtime on all flavors of cloud nodes.

Ran a face detection application in their experiments. Although there are tremendous gains, there are issues with routing overhead due to its implementation over HTTP.

Discussion - Network requirements:

- Routing requirements for apps based on functions
- Synchronizing clocks between DCs
- Providing a locking service for data consistency
- Better guarantees for data can be provided (if there is a service inside the network itself)

[From chat]

Jianfei(Jeffrey) HE: @Seyed, when you say for networks to provide "locker" service, can be a global sequencer inside the network equivalent to that locker manager? Global sequencer: receive all requests and assign a global unique sequence to each.

Seyed Mortazavi: @jianfei yes, I think a lock manager can be appropriate, maybe different levels of lock managers to keep locality. I have to think about the global sequencer, but a lock manager I think is more appropriate compared to a full blown paxos solution for the edge. The key here is the local reads/write. If we don't have local read/writes then the edge becomes less useful.
One may think why use the edge and have all the complexities when one can simply send to the cloud.

## e) The connected intelligent machines’ technology journey - Edgar Ramos (Ericsson)

https://www.ericsson.com/4af428/assets/local/reports-papers/consumerlab/reports/2020/ericsson-10-hct-report-connected-intelligent-machines.pdf

This presentation is more about research questions for the future, comparing what the network is today vs what it should look like in the future. Crowdsourced a set of questions to researchers at Ericsson and this presentation is the result.

10 exponential technological forces: ubiquitous connectivity, IoT, AI, cloud & edge, trustworthiness, robotics, new ways of interaction (think goggles), new computing paradigms, alternative energy tech, smart materials. Combinatorial effects of their interaction and intersection.

Re COIN: The challenge: will the future network match machine evolution? Integration of more autonomous, more generic devices. Yet acknowledge the challenge of facilitating customization whilst maintaining interoperability.

The network services needed: discovery, exposure, trust and self-organization. When everything is intelligent, there will be a need for mediation and arbitration.

Open question: will the network be relevant in this vision of the future? In a highly intelligent world, the network's relevance will depend on the incentives provided, to avoid being by-passed.

The discussion today: Connected intelligent machines (en route to 6G).

What do we mean by intelligent machines? Encompasses multi-sensing deeply specialized machines, machines learning from other machines, regenerative self-managed machines, environment- & intent-driven machines, decentralized intelligence & collectively reasoning machines, machines moving towards general intelligence.

What are the ranges of enhanced network platform features and services needed in this intelligent machine vision?
If everything is connected, there are additional dimensions of the problem space to consider, such as: interoperability, human-machine co-existence, trustworthiness (where trustworthy means responsible, resilient, secure, safe, explainable, unbiased, fair and ethical).

[See the slides for a detailed and very interesting visualization of the Connected intelligent machine relationships matrix.]

A high-level AI-2-AI network stack is presented.

[From the chat]

Stuart Card: This is a remarkably broad vision, including what appears to be acceptance of "runaway AI". I agree, so am focused on "friendly AI" and trustworthiness.

Edgar Ramos: @Stuart..and well AI is also biased as humans are in many aspects, and we can expect that there will be conflicts on how agents will act depending on who produce them even if not ill-intended (from for example cultural crash perspective).

# 2. Drafts Updates

## a) draft-sarathchandra-coin-appcentres-04 - Dirk Trossen - 10 minutes

“In-Network Computing for App-Centric Micro-Services”

https://datatracker.ietf.org/doc/draft-sarathchandra-coin-appcentres/

Moved the use cases into the use case draft. Just a stub reference now. The requirements have been linked more clearly to Section 5, which now has a mixture of refs to research and standards. Section 6 is entirely new. Requirements were put into technology areas (vs use cases previously). Also discussing how to get those requirements into the use case discussions. The mapping of requirements to standardization efforts, although extensive, is of course not exhaustive.

Future: Link to other drafts, update additional related research, cover even more SDO efforts, and especially identify what standards have NOT done so far.
## b) draft-hsingh-coinrg-reqs-p4comp-03 - ran out of time

Hemant Singh - 10 minutes

“Requirements for P4 Program Splitting for Heterogeneous Network Node”

https://datatracker.ietf.org/doc/draft-hsingh-coinrg-reqs-p4comp-03

## c) draft-hsingh-coinrg-p4use-00 - ran out of time

Hemant Singh - 5 minutes

“Use of P4 Programs in IETF Specifications”

https://datatracker.ietf.org/doc/draft-hsingh-coinrg-p4use-00

# 3. Discussion - 10 minutes

Recurring feedback: Need to leave more time for discussions!!!

# 4. Conclusions and Future Plans

- Interim and IETF 111 - targeted for May.
- Action item: I-D authors of expired drafts, please let co-chairs know whether you intend to progress the drafts - or not.
- Goal for the next Interim: Update Milestones.