# IETF 111 Computing-in-the-Network (COIN) RG

- COIN RG: https://datatracker.ietf.org/rg/coinrg/about/
- COIN RG documents: https://datatracker.ietf.org/rg/coinrg/documents/
- Meeting materials: https://datatracker.ietf.org/rg/coinrg/meetings/
- These notes: https://codimd.ietf.org/notes-ietf-111-coinrg

Notetaker: Jianfei He

Chairs' update: reminder of IPR policies; today's agenda is use-case focused. Document status check: authors are reminded to manage their documents (update or abandon).

# 1. COIN related activities (20 mins each)

## a) Piccolo -- In-network compute for 5G services Use cases

https://piccolo-project.org/

### 1) Risk Monitoring - Philip Eardley

The objective is to assess the risk inside the car and outside it (road conditions, other cars, etc.) and to raise signals, for example to the driver.

Piccolo architecture: add Piccolo nodes inside the network and dynamically allocate resources among edge nodes (e.g., cars), Piccolo nodes inside the network, and the analytics engine inside the data centres. The main advantage is flexibility: computations can be deployed according to the resources available at the nodes, and different algorithms can be delivered for different scenarios, such as public buses or private cars.

A Proof-of-Concept has been developed and is now in the lab testing stage.

Topics under development & Research questions:
- Re-engineer the in-network logic to be able to handle multiple edge nodes
- Orchestration/control function, e.g. to deliver algorithms to the network nodes
- Capabilities of in-network nodes: just ML/DNN inferencing or arbitrary logic?
- Flex resources up or down to handle concurrent demand
- Ensure the in-network execution is secure & private, as the vehicle moves

Q: Kireeti Kompella. How do you balance the functions between the edge nodes and the Piccolo nodes? The functions related to the security of the car should preferably be in the edge nodes. And if the network conditions are not detected very well, are more functions expected inside the edge nodes? Are these considerations part of the design?
A: Yes. Make sure the functions in the network are not safety-critical or, if they are, make sure to provide a fall-back that is quick enough. This is a use case to explore general distributed orchestration and what the issues are.

[From chat]
Comment: Adrian-Cristian Nicolaescu: I am pretty interested in Piccolo but I couldn't find a way into the project and to discuss and develop ideas with anyone within the team yet. Maybe we can discuss further at some point, if you wish. There are several ideas I have for how my work could fit in with this: "data augmentation/privacy filtering" part is especially interesting.
A: Dirk Kutscher: yes, please feel free to contact us. In the automotive scenario, the potential inclusion of edge networks relates more to upstream data transmission/processing.

### 2) IIoT Smart factory - Peer Stritzinger

This use case is about a real Bosch factory manufacturing power brakes for cars. Normal factories program the machines manually: the movements of all the pallets and the processes are handcrafted in PLC programs. This is neither flexible nor efficient.

The goal is "plug and produce": stick together a conveyor belt system that self-learns the topology and routes the pallets in an optimized material flow. This should be built into the conveyor belt system and not use an external server.

Embedded computing nodes are connected to form a typical network topology and run message-passing processes (using the Erlang system). The question is how to map these complicated processes to nodes in a complex topology.

Two modes of mapping: a static one that maps processes to nodes, mainly locally, to control the neighbouring elements; and a very dynamic one that makes use of all the computing to run more complex, distributed planning algorithms.

Research questions:
- Can a distributed orchestrator map the computation in these cases?
- Can we successfully run a distributed online planning algorithm on a mesh of IoT systems only?
- What could a generic extensible solution look like?
- Possible extension: Ethernet TSN path control and reservation

Q: Eve Schooler. What kinds of hard real-time constraints are part of the system?
A: The current system only controls the conveyor belts; time constraints are tens of ms or 5 ms, and software control is sufficient because nothing bad happens if you are slow. If you want to do combined access, precisely control the pilot, sync the robot arms, you need to go to sub-ms. That will be the extension of this use case.

## b) Preliminary Summary of Dagstuhl CFN seminar - a teaser - Dirk Kutscher

https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=21243

Objectives for this Seminar: Definition and Research Agenda for "Compute-First Networking".
- Is there a space for a new approach of integrating computing and networking, beyond packet flow processing and beyond microservice overlays?
- Learning from distributed computing systems (and a taxonomy of those)
- What are today's hard problems?
- What are promising new approaches?
- Is there a critical mass of research?
- What are the research challenges?

As a motivation: many successful distributed computing systems are overlays today. Can we do a better job by leveraging the network to optimize the performance or to make it easier to compose these systems?

The agenda includes use cases, challenges, and research ideas such as PhD topics.

Use cases: health sensing and analytics done in a federated way, without uploading to a centralized place, for privacy preservation. E.g., Federated PCA, split learning.

Facts: server vs switch; constraints in P4 and the programmable data plane make it difficult to use as a general computing platform.

Challenges: how to handle heterogeneous computing platforms.

A clean-slate way to view "orchestration": leverage tools like in-network telemetry to help the nodes make smart decisions by themselves.

Rethinking abstractions of computation and communication, e.g. a broadcast-only network, distributed memory side effects.

Provisional Summary.
Trends in hardware development:
- Limitations of multi-core
- Evolution of specialized hardware support
- Programmable data plane

Trends in distributed application design:
- Distributed ML (federated ML, split learning)
- Applications increasingly multi-party and distributed

Opportunities:
- Programming abstractions and platforms that do not treat the network as a black box
- Leveraging different types of hardware platforms optimally
- Exploring optimization potential: joint optimization, less centralized designs
- Rethinking the role of management and orchestration

# 2. Theme: COIN Use Cases (20 mins each)

## a) Compute-First Networking Use Cases - Dirk Trossen

Based on the collaboration project between BT, Huawei and Cambridge University.

Taxonomy for Use Case Work: Description; Services; Drivers; Economic Value; Time to Demand; Stakeholders.

Use case 1: Distributed data storage, also a meta use case (used by other use cases). Services use a distributed consensus system (DCS) and DLT with discovery and transactions, and use some hashing methods or work patterns to find the right miners.

Use case 2: Transportation/vehicle to everything (V2X). Service requirements:
- Provide reaction capacity to (virtualized services with fast re-routing)
- Reduce needed capacity (filter, pre-processing)
- Data privacy and access control (link to DDS use case)
- Always connected to best service (compute-aware traffic steering)

Use case 3: Digital twins.
Service requirements:
- Data retrieval, fusion and storage (e.g., using DLT solutions)
- Distributed AI computations for models, using gRPC or RDMA invocation models
- Communication patterns may be 1:1, 1:n, M:N with short-lived groups, often needing to optimize the communication and computation pipeline (e.g., live feed A/V with feature extraction in a central unit)

Requirements Derived from Use Cases:
- Identification (linking to processes, obfuscation of purpose)
- Announcement of computation (within and across domains, delegated announcements, pre-announcement)
- Interconnection across limited domains [RFC8799]
- Allowing to bind to available computational instances under dynamic constraints
- Collective communication patterns (request-specific)
- IPv6 support

Next steps: pick a use case of choice; system architecture this year, demonstration next year.

Q: Eve Schooler. Regarding the consortiums that you are listing, which of these should we hear from? Does the CFN project interact with them?
A: We're involved in some initiatives. GAIA-X: Huawei is active. In the Industrial IoT Consortium there is an industrial ledger working group, where we presented the impact of DLTs on the network.

Q: From chat window, Adrian-Cristian Nicolaescu. Are you targeting Filecoin?
A: Filecoin addresses some of the problems of systems like Ethereum through election, which leads to some sort of centralization. To me, that's one of the tussles, in that the problems of DLTs (over IP networks) may counter the very nature of a DLT, i.e., its distributed nature.

## b) Use Case for P4 Programmability by Tenants of Future Mobile Virtual Networks - Xavier de Foy

https://datatracker.ietf.org/doc/draft-defoy-coinrg-p4-by-tenants-in-mobile-nw/

This use case uses 3 well-known technologies: P4 for data plane programming, 5G as the underlying network, and 5GLAN as the virtualization technology.
Rationale for P4 Programming for Mobile VNs: handle complexity and enable interchangeable or portable programs over various underlying networks.

High level description: The mobile network appears as a logical switch to the tenant; the tenant deploys P4 programs in the logical switch. The tenant can also operate a controller, which communicates with the switches using the P4Runtime API.

Requirements / Opportunities & Research Questions:
- Splitting/Distribution: Data paths are distributed, with no central node in the general case; need to study the distribution of P4 programs
- Multi Tenancy Support: Multiple 5GLANs can share the same infrastructure; MTPSA and other studies on virtual network data plane programming provide useful solutions in this space
- Mobile Network Awareness: A P4 program could interact with the mobile network system
- Mobility Support: P4 programs should follow the data flow when mobile devices move to other attachment points
- Security: risks include overusing network resources, injecting traffic, and unauthorized access to traffic

Next steps: research on new programmable data planes and on other underlay networks such as data centers.

Q: Dave Oran (from chat). Virtual routing has been around for 20 years, but isolation is still not solved. Is P4 making this easier or harder?
A: Need to integrate P4 and the mobile network, to be controlled by the operators. Don't see big blocking issues.

## c) Store Edge Networked Data (SEND): A Data and Performance Driven Edge Storage Framework - Chris Nicolaescu

https://www.researchgate.net/publication/346643946_Store_Edge_Networked_Data_SEND_A_Data_and_Performance_Driven_Edge_Storage_Framework/stats

The edge data repository is partially inspired by reverse CDN. Whenever there is a request from the edge, it always stores the data needed, so that whatever future requests come, it may be able to satisfy them from the store.
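The store-on-request behaviour described above can be sketched roughly as follows. This is an illustration only, not the SEND implementation: the class name `EdgeDataRepository`, the `fetch_from_producer` hook and the hit/miss counters are assumptions made for this example and do not come from the presentation.

```python
from typing import Callable, Dict, Tuple


class EdgeDataRepository:
    """Toy edge data repository: keeps a copy of every item it serves."""

    def __init__(self, fetch_from_producer: Callable[[str], bytes]) -> None:
        # fetch_from_producer is a hypothetical hook standing in for
        # whatever edge source actually produces the data.
        self._fetch = fetch_from_producer
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def request(self, name: str) -> bytes:
        """Serve `name` from the local store if present, else fetch and keep it."""
        if name in self._store:
            self.hits += 1
            return self._store[name]
        self.misses += 1
        data = self._fetch(name)
        self._store[name] = data  # stored so later requests can be served locally
        return data


def demo() -> Tuple[bool, int, int, int]:
    """Request the same item twice; the producer should be contacted only once."""
    producer_calls = []

    def producer(name: str) -> bytes:
        producer_calls.append(name)
        return f"reading:{name}".encode()

    edr = EdgeDataRepository(producer)
    first = edr.request("sensor/42")
    second = edr.request("sensor/42")
    return first == second, edr.hits, edr.misses, len(producer_calls)
```

In this sketch the second request for the same name is answered from the local store, so the producer is contacted only once.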
Implementation: The goal is to increase the delivery efficiency of edge-produced data. Two types of timing are introduced: freshness period and shelf-life (maximum storage time at the edge). Based on management decisions and the strategy chosen, EDRs (Edge Data Repositories) relocate or replicate data.

Architecture and Assumptions: Data is used for processing within its freshness period and is useful at the edge for its shelf-life. Statistics on stored, processed and served data are sent periodically to the management module.

Labels: Offer the ability to track performance, the popularity of data, and data placement statistics. The goal is to make the system aware of data context and improve its performance. Labels enable accurate data placement and storage decisions. Different data categories can have different freshness periods and shelf-lives, and different strategies can be applied to them.

Prototype evaluation using Google File System. Good achievements on data insertion times, data lookup times, and on-time completion.

Future work: network and computing efficiency; information retention, reliability and security in Edge Data Repositories; potential development into more architectural and intelligent systems.

Q: Eve Schooler. Who or what creates the labels to process better as part of the functions in the network?
A: The applications, and labels are configurable by users or operators.

Q: Eve Schooler. Different categories of data, function-based, cloud-oriented, etc. How can applications tell the system in what direction the data is bound? What is its intended migration path?
A: Applications generate data and need certain data to be queryable at a certain time. It's not about where the data is bound; it can be stored in any environment.

Q: Marie-José Montpetit. What are the assumptions about the underlying network and storage? What type of operational systems?
A: We do simulations. Different request rates (1k~100k requests per second) are generated, depending on the datasets. 16 data repositories are distributed.

# 3. Lightning presentations of current drafts

## a) Use case draft update - Ike Kunze

https://datatracker.ietf.org/doc/draft-irtf-coinrg-use-cases/

Merged contents from the drafts of Dirk Trossen, Marie-José and others. Currently the draft contains only use case descriptions. What we'd like to do eventually is to start analysing the use cases in order to derive a research agenda, and to group the different use cases in a more sensible way. New use cases are welcome.

# 4. RG topics

Announcements (related workshops, conferences etc.): The CoNEXT workshop “The Computer in the Network” has been approved and will happen online on December 7th, 2021. It is a spin-off from this research group.

Next meeting: consider an interim “virtually co-located” with the CoNEXT workshop in early December.