# T2TRG IETF 101 Summary

Meeting Thursday, March 22, 2018, 15:50..17:50 GMT (UTC+00:00)

Note takers: Christian Amsüss & chairs

## Intro, RG Status

Slides: https://datatracker.ietf.org/meeting/101/materials/slides-101-t2trg-chair-slides

Carsten Bormann: note well, announcing GitHub as per slides, presenting the agenda.

What is T2TRG / its scope? We care about true IoT, where (also constrained) things communicate among themselves and with the wider Internet. We look at issues in the area that provide opportunities for standardization within the IETF or other SDOs. Radio is not a topic here (though it may need consideration); the range from IP adaptation up to the end user is, obviously including security.

The current focus is on semantic and hypermedia interoperability ("WISHI"). Current drafts: IoT security considerations. A side meeting on coexistence happened during this IETF. Had an online session with OCF on security and ACE. Recently held the NDSS workshop on Decentralized IoT Security and Standards ("DISS") -- combining "you have to have standards for interoperability" with "IoT is often not centralized". 12 papers in publication, ranging from reviews of IETF work to speculative ideas.

Next steps:

* Prague meeting starting tomorrow (T2TRG, OCF (former OIC, AllSeen, and UPnP), W3C WoT)
* regular WISHI calls
* possibly OCF plugtests
* IETF 102 Montreal

Current documents: "security considerations" is in IRSG review.

Padhu: For joint meetings, it could be interesting to have a joint call with OMA. A testfest is coming in July; it would be interesting to showcase some of that work there. CB: Good idea -- when? Padhu: The week before IETF, Monday to Thursday, breaking on Friday so people can travel to the IETF. CB: One of the earlier CoAP plugfests was with OMA; it makes sense to deepen the collaboration.

(back to documents) The reviews were useful. "RESTful Design for IoT": hypermedia guidance is in review, PATCH/FETCH guidance is still needed. (By the way, everything is handled with git, so we are processing pull requests.)

### Network coexistence topic

Laura (reporting for the network coexistence side meeting): Administratively independent IoT applications can cause packet loss for each other through radio interference (not directly our topic), but timing behavior makes those problems worse, and we should consider that. Right now we lack the tools to evaluate such situations (multi-channel logging), and the problem affects several layers. The MAC must do most of the work (-> an IEEE topic), but it has only limited capability; there is a draft [https://tools.ietf.org/html/draft-feeney-t2trg-inter-network] describing the topic and what we can do.

Gabriel M: Please elaborate on the research; in Europe devices are required to listen before talk. Does this research assume LBT is in place, or do these issues remain? CB: We are trying to point out the problems right now; the conclusion so far is that we need more research. We build everything we do assuming we are alone on the medium, but should consider that the highway is not as empty as in our testbeds -- in practice hundreds of SSIDs can be heard. Some things are hard to test in today's testbeds. We want to look at the situation and start research based on the observation that independent networks meet in the wild. Laura's research shows that there are surprises; we need to find out what other surprises are out there.

Gabriel: There are rules in place depending on the regulatory domain; we will need to keep that in consideration. Laura: More details are in the draft.

Juan-Carlos: Responding to Gabriel, all those points were part of the side-meeting discussion. Not all protocols follow the same MAC;
some are IEEE, some are something else. There is still potential for sharing knowledge at higher layers; knowledge useful for coexistence could be obtained through the Internet. Some of the networks are public, some are private; there might be things we can do without endangering privacy. CB: Let's not design the solutions today.

Eliot Lear: Where and how does this conversation continue? CB: On the RG mailing list. EL: Does the draft recommend a data collection methodology to characterize the problem? Laura: The document focuses on laying out the challenges. I recommend that T2TRG, or some other mechanism, develop ways to evaluate how 6lo, ROLL, etc. perform in these environments. EL: Even before we make recommendations, we need data collection methods. Laura: I'm not even sure we are there yet; we need to make up our minds about what we want to know as an RG.

Alexander Pelov: Working on LPWAN-related topics; this is an interesting topic. Laura: It touches a lot of groups, T2TRG and others -- but not now, we're over time for this.

## Report from WISHI and Hackathon

Michael Koster presenting: https://datatracker.ietf.org/meeting/101/materials/slides-101-t2trg-report-from-wishi-and-hackathon

WISHI := Workshop on IoT Semantic/Hypermedia Interoperability.

Essentially, we looked at semantics and protocol-neutral semantic annotation: how does this work with ontologies and third-party vocabularies that might be hard to use? Where does the semantic metadata go? Is this a layered stack with different metadata at different protocol levels? Several SDOs are interested. Data types and engineering units are also a topic.

The goal for this hackathon was bringing diverse things together and exploring semantics-based discovery. HTTP, CoAP, and MQTT were involved, starting from simple input and basic orchestration (motion sensor → light).

Implementations involved: [see slides]

The technology for connecting them came from the W3C WoT group, which has similar goals. As results, Thing Descriptions (TDs) were generated from LWM2M instances, CoMI models were converted to TDs, and TDs were stored in a Thing Directory and used from there.

### Thing Descriptions

A Thing Description is an RDF media type describing the interactions supported by things; it binds those interactions to instances of things and to transfer-layer instructions. Applications used abstract interactions, rendered onto different protocols.

Layered scope of TDs: "what we want to do" (from the information models) meets the protocol bindings (including particular CoAP-based protocols like OMA's) at the TD. While they all use CoAP, they use it in quite different ways.

Things can register with a Thing Directory, and applications discover them from it. It uses the same protocol as the Resource Directory. We used one TD as a well-known entry point to the system, and the applications learned what they needed to know from it.

[schematic of interoperability] First registration, then discovery, then direct or intermediary-mediated (e.g., pub/sub broker) use, which may involve different components. A Thing Description can be reachable both locally and from the Internet.

Example with a YANG implementation: the YANG description is converted to a TD and fed into the directory, which is used by an HTTP device via a servient.

Hannes Tschofenig: What are the entities here? What's the green box? MK: It can be the device or an agent acting on behalf of the device. HT: What in the [without green box and with green box] diagrams corresponds to what? MK: The servient is an application proxy. The servient interprets the operations as TD operations and passes them on.
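For concreteness, a minimal sketch of what a Thing Description for the hackathon's motion-sensor → light scenario might look like, written here as a runnable Python snippet. The field names follow the spirit of the W3C WoT TD drafts but are illustrative rather than normative, and the address and LWM2M-style path (object 3311 "Light Control", resource 5850 "On/Off") are assumptions:

```python
# Illustrative only: a simplified Thing Description for a light that an
# application could discover from a Thing Directory. Real TDs are JSON-LD
# documents with a richer vocabulary, security metadata, and semantic
# annotations (e.g. from iot.schema.org).
import json

light_td = {
    "name": "demo-light",
    "interaction": [
        {
            "@type": ["Property"],
            "name": "on",
            "schema": {"type": "boolean"},
            "writable": True,
            "form": [{
                # Hypothetical CoAP binding via an LWM2M-style path.
                "href": "coap://[2001:db8::1]/3311/0/5850",
                "mediaType": "application/json",
            }],
        }
    ],
}

def binding_for(td, interaction_name):
    """Look up the protocol binding (href) for a named abstract interaction."""
    for ia in td["interaction"]:
        if ia["name"] == interaction_name:
            return ia["form"][0]["href"]
    raise KeyError(interaction_name)

print(json.dumps(light_td, indent=2))
print("To switch the light, write to:", binding_for(light_td, "on"))
```

The layering is visible here: the application only deals with the abstract "on" property, while the form tells it (or a servient acting on its behalf) which protocol request to make.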
HT: The diagrams are not aligned; not all boxes appear everywhere. It would be good to get the slides in sync to make this understandable. Ari: These partially show the hackathon setup, which might not always map fully to the draft diagram. MK: The device is not registering itself in the Thing Directory. It only registers itself with the LWM2M server, and the adapter creates a semantic annotation based on its knowledge of the LWM2M numeric codes. The HTTP device (here: a RasPi) can register directly, but that's not necessary.

Next steps:

* annotate RD and link-format (what do we need to annotate?)
* more backends, e.g. BACnet
* more automation, e.g. SPARQL to URI-query

HT: Was the mapping possible without losing data? We're piling things up, and things get difficult when something is lost. MK: In the lightbulb case there is no loss, and it's extensible, so more details from the backend can be added; there is no need to describe protocol details. HT: One thing we learned from the IOTSI workshop is that you need to look at the data *and* the interaction model. It would be interesting to look at a more complicated example to see whether it keeps working.

Matthias Kovatsch: In W3C WoT, we look at the interaction model: what are the abstract operations? There we have properties, actions, and events (reads and writes, calling actions that may take time or be RPC-based, and asynchronous sending). That is the core model in the Thing Description. Here the common protocol was CoAP; the interactions are noted down in the Thing Description.

HT: What I'm interested in: did the things you tried to accomplish work out? Where were the problems? Matthias: The W3C document documents that. HT: That won't contain the hackathon experience. Matthias: The report is not ready yet. Another plugfest is on next weekend with different organizations, so there will be more reports. We're currently collecting data about what semantic interactions we have; iot.schema.org is also there and covers the easy ones.

Ari (from floor): LWM2M experience: what we did so far was very simple -- get information about registered clients from the server and map it to a TD. We'll learn more, but so far things mapped well to the Thing Description; the next step is semantic mapping and the question of how close that can get.

Eliot Lear: Before you get to more complex devices, it would be useful to have a more complex model of existing devices, e.g. Hannes' work on ACE. The moment we add an authorization model, even the simplest devices get complex, and that's a great area for discovery. I'm not going to do that, but it's hard. Michael Koster: Good point. We're also looking at the security models; it's a next step, but not far from now. How do you generate the right subject and tokens? That's for the next phase.

Mays Al-Naday: How big is that Thing Directory? At which scale? MK: A Thing Directory can be on a router or in the LAN. Mays: Not interacting remotely with other TDs? MK: We haven't worked on federation of TDs. One of the next models is an automotive model; there, a door switch must be bound to a door, and we don't have infrastructure for that yet. But we're out of time now.

CB: This is an ongoing activity, and we will report more of the findings later on.

## Deep learning on microcontrollers

Jan Jongboom presenting: https://datatracker.ietf.org/meeting/101/materials/slides-101-t2trg-deep-learning-on-microcontrollers

This is about a research project at ARM. My first experience in the field was when teaching summer school in Tanzania; the question was "how would we apply machine learning to our fields", and I was the IoT guy. Questions like "where to store the data" were discussed, along with actual field tests in applications.
This will happen again this year with students.

Machine learning is often understood as big-datacenter stuff. Taking a step back: the algorithms can run on edge nodes. "Sensor fusion": combine cheap sensors and gather data; the combined data has worth, creating a supersensor. For example, a small group of sensors can be trained to observe things in a house.

Google is doing federated learning on Android keyboards: they can't send raw training data to the cloud, both for privacy reasons and because of the size of the data. There is a local model that is continuously refined, and the model changes are sent to the cloud, where they are processed on a cluster.

Another example: for file formats, there could be really good compression. The model for feature reduction is not immediately obvious; a deep learning model can create an encoding scheme for very localized sets of data. With Sigfox and LoRa, compressed versions are critical for bandwidth reasons.

Off-line, self-contained systems (again Tanzania): an ML model on a farm there; without Internet access, expensive computers can't be used. We wanted to detect whether a cow is in heat based on data from its skin.

Edge vs. cloud: the edge can be faster (fewer round trips) and more efficient (in terms of energy), at least for some use cases. "Edge" here means microcontrollers -- small, cheap, efficient; slow, with limited memory.

"uTensor": built on Mbed. It runs deep learning inference (no training) in <256k of RAM. It does classification, but no learning; the model is learned in the cloud and pushed to the device. Object classification in videos works as well. This enables interesting use cases without upstreaming data or using expensive machines. The project originated at ARM, but is not run by it. There is a simulator for it. Looking for people to experiment with it and contribute.

Demo: draw on a touch screen and have it tell what it was, running on a microcontroller with less than 256k of RAM. The MNIST handwriting data set is the basis for that; supervised learning with backpropagation, trained with TensorFlow.

Classification on the µc: the touch-screen input is trimmed, downscaled (28x28 px), and fed to a 28x28 = 784-neuron input layer; it then goes through a hidden layer to the output layer and a loss layer that picks a digit. Hidden layers involve matrix multiplication, bias, and an activation function. This matters because it looks too hard for a microcontroller.

What helps is quantization to 8 bits (because memory explosion is the main issue); a sketch of the idea follows below. During training, 32-bit floats are relevant, but for classification, 8-bit integers give only moderate losses in accuracy. Dequantization would require converting back into f32, but that is worked around. Memory is primarily used for neuron data, but this still leaves some room in 256k. Memory paging is an option if the µc has <128k of RAM. The layer description is precompiled and used from ROM (26k). Sparsity can also be exploited (saving memory at the cost of some more accuracy).

Operators (see slides) need to mirror TensorFlow operations on the microcontroller. With some upcoming operators, more classifications become possible. Tensors can be RAM-based, flash-based, sparse, or networked (that last one might interest the IETF: meshed devices each look at a subset of the data; it's a floating idea). We can split data sets and do classification; it would be cool to see 100 devices do a classification together.

Workflow: a data set goes into TensorFlow; the uTensor CLI creates tables from the trained model and generates .cpp and .hpp files, which can then be executed on the µc. [slide: graph compared to the actual C++ code generated for it]

The Mbed simulator runs in the browser: it cross-compiles into JS and runs it there.
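The 8-bit quantization step mentioned above can be sketched in a few lines. This is generic min-max (affine) quantization as popularized by TensorFlow, with an illustrative layer size; uTensor's internals may differ in detail:

```python
# Sketch of post-training 8-bit quantization: store weights as uint8 plus a
# per-tensor scale and zero point, reconstructing approximate floats on use.
import numpy as np

def quantize_u8(w):
    """Map a float32 tensor onto uint8 with a per-tensor scale/zero point."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0 or 1.0          # avoid division by zero
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(w / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """The 'back to f32' step that uTensor works around where possible."""
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical 784 -> 128 fully connected layer (784 = the 28x28 input).
w = np.random.randn(784, 128).astype(np.float32)
q, s, z = quantize_u8(w)
err = float(np.abs(dequantize(q, s, z) - w).max())
print(f"f32: {w.nbytes // 1024} KiB, u8: {q.nbytes // 1024} KiB, "
      f"max abs error: {err:.4f}")
# ~392 KiB of f32 weights shrink to ~98 KiB of u8 -- the 4x saving that
# makes fitting a model into a 256 KiB-RAM microcontroller plausible.
```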
New: kernel extensions for Cortex-M in CMSIS-NN -- these allow using higher-end Cortex (M7 rather than M3) operations for a speed-up. Object classification in images becomes viable and will be integrated into the project. Demo of live object recognition on a Cortex-M7 microcontroller, detecting cats and frogs in a live webcam feed. Speed: on a 216 MHz device using 133 kB of RAM, the RAM is sufficient and computation speed is the limit; 3 convolution layers. Went from 500 ms to 100 ms (10 fps) when CMSIS-NN was introduced.

Recap: try it.

Edgar Ramos: Comment: the problem I see with such an architecture is the same as with mobile phones and "many microcontrollers that do their own thing": how can you address all of them? If you want to run one algorithm on all devices, how can you port it to them? That is something we'll need to overcome.

Slides are online.

## Semantic Interoperability Testing - Current Trends and Future Outlook

Remote presentation from Soumya Kanti Datta: https://datatracker.ietf.org/meeting/101/materials/slides-101-t2trg-semantic-interoperability-testing

(from Eurecom, which has many industrial partners) This is a small industrial extension of a Horizon 2020 interoperability project, identifying gaps in semantic interoperability testing.

Interoperability is key to the full potential of the IoT market, and there is a strong need for interoperability at the data level. Recently, many have identified *semantic* interoperability as the way to address that. In industry, people understand the benefits of semantics but have no way to test and quantify them. So we identified the testing of interoperability as a gap in the existing solutions: how can we provide tools and guidelines for semantic interop?

We propose conformance and interoperability testing. Conformance is tested against a reference ontology; interoperability testing has two systems under test (SUTs) that exchange data. Requirements for conformance tests were gathered [see slides]. A conformance test has test scenarios between the SUT and a tester, and the tester produces a validation report based on a request by the SUT.

Basic example scenario: two SUTs, with the objective of executing an operation and verifying the result data. Both devices execute the same operation, and the results are compared for equivalence. Discussed a OneM2M testing example. Introducing additional components between the SUTs: a query server can request operation results from two SUTs and compare the results. At the data level: a tester receives data from two SUTs and compares them for equivalence under a data model. More complicated scenarios are possible; the main motivation for the scenario on slide 14 is extending F-Interop to semantic tests. The comparison server could be an existing product.

Please fill in the survey and provide feedback.

## Secure Computations in Decentralized Environments

Michał Król presenting: http://mharnen.gitlab.io/t2trg18/ PDF: https://datatracker.ietf.org/meeting/101/materials/slides-101-t2trg-secure-computations-in-decentralized-environments-00

This is about outsourcing computations. A requester is willing to pay for a computation and has bidding nodes that would do the computation; the requester verifies the computation and pays up. Right now this is done in the cloud. Running at the edge has privacy benefits, so it could be done by a mesh of nodes and run locally. But that changes trust management: in the cloud setup we trust Google, but in a mesh we have to have a built-in trust model. It's an open system, with no vetting of participants; judging is needed in case of conflict; rewards are needed (otherwise, see torrents); actual validation of results needs to happen. Atomic payment and result transfer is required.
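A toy sketch of how a hash lock can make that exchange atomic -- the decryption key is published exactly when the payment is claimed -- follows. This is an illustration of the idea only (stdlib Python with a throwaway XOR "cipher"), not the actual AirTNT protocol described below:

```python
# Toy hash-locked exchange: the requester learns the decryption key exactly
# when the execution node claims its payment, making the swap atomic.
import hashlib, os

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Throwaway stream cipher (SHA-256 keystream) -- demo only, not for real use."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

# 1. The execution node computes the result inside the (attested) enclave,
#    encrypts it, and commits to the key.
result = b"computation output"
key = os.urandom(32)
ciphertext = xor_stream(key, result)
key_commitment = hashlib.sha256(key).hexdigest()
# The requester receives (ciphertext, key_commitment), attested by the enclave.

# 2. The payment transaction is spendable only with a preimage of
#    key_commitment, so claiming the money forces the node to publish `key`.
published_key = key  # revealed on-chain when the node redeems the payment

# 3. The requester verifies the commitment and decrypts the result.
assert hashlib.sha256(published_key).hexdigest() == key_commitment
print(xor_stream(published_key, ciphertext).decode())
```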
Verification: this depends on whether recomputation is required or checking is cheap. Zero-knowledge proofs are not available for everything. Alternative: have the computation done by many nodes (but that is inefficient and carries the danger of colluding nodes). Results should also remain private from the platform.

Possible approach: trusted execution environments. Key technologies:

* Intel SGX: a trusted execution environment (invisible to the hypervisor, with remote attestation and a direct communication channel).
* Blockchain and smart contracts -- but be careful, because they are public and have long delays. Payment channels run transactions off-chain (only a deposit goes on the blockchain). Oracles are used as trusted data feeds.

Assumptions:

* distrust in each other
* trust in the function
* trust in the blockchain

Example: AirTNT. The execution platform runs an enclave; a channel to the enclave is opened and the enclave is attested. Input data is sent to the enclave and the result is generated. The enclave encrypts the result and sends it back (the requester now has the encrypted result and a hash of the encryption key), both attested by the enclave. The execution node sends a payment transaction to the blockchain that is unlocked by publishing the secret on the blockchain.

* Downsides: slow, allows DoS.
* Mitigating DoS: use a payment channel to split the task into many small tasks, chunk by chunk. (That works because there is no longer a transaction cost for small payments.)
* If communication fails, it's hard to tell what went wrong.

Alternative to the blockchain: the result is put on IPFS, and the smart contract uses an oracle into IPFS.

Conclusion: the system is built and fully automated. It needs Intel hardware and is limited to 100 MB.

Hannes T: Comparison to other platforms? Michał: Not yet; I'm not aware of any alternative that can validate the results. HT: Well, they have different characteristics. Michał: It's only a start, and yet to be published. HT: This is one solution with its characteristics, but there are others; I would like to see comparisons. Michał: We didn't find anything with these characteristics. HT: Yes, but there are other interesting ones; I would be interested in reading about that.

Open questions: task dispatch is currently up to the nodes, but how would this be automated? How to deal with different prices, loads, and capacities? How to estimate the cost of a computation in a heterogeneous environment? How to protect who executed which function?

CB: Interesting for people who thought TLS was complex. We are going to see more things like this in decentralized environments; an interesting example of the consequences of decentralization.

## Meeting Planning, Wrapup

CB presenting the meeting tomorrow in Prague. Dave Thaler about the agenda: we should carpool to the flight. OCF, T2TRG, and W3C WoT will meet, with status updates between each other. Determine unstandardized issues: what data do you need to send for an action? Talk about the model interoperability tested in the hackathon (IPSO/LWM2M is simple to translate, TD is powerful, OCF has RAML/Swagger-based descriptions -- put them together). How to model pushing data? Pub/sub is popular but has problems of its own. Bindings: what representations are involved? Housekeeping on CoAP. Talk about ACE (security framework) and its interaction with the OCF model and the authorization formats in use there (cf. the Fairhair whitepaper); multipoint security is an important topic here. Reference implementations and test cases.