# MIMI Prague {#mimi-prague}

## Richard Barnes - Architecture Overview {#richard-barnes---archicture-overview}

**Explanation of CSSC**

* Each room is hosted at a hub server
* All comms go through the hub
* Not every server wants to talk to others

**Terminology**

* Rooms are the unit of measurement:
    * They have a state, which has the components:
        * auth policy,
        * users (participants),
        * MLS
* Participants can be active/inactive

**Preemption**

* A user may be a participant only if authed
* The client can only be a member of the MLS group if a participant

**Confirmation**

Different aspects of the state are managed via control sub protocols. Each commit reflects the current room state. All clients have a common understanding of the room state.

EKR: Same issue as last time. Imagine the ban use case.

RB: The servers run a little bit ahead of the MLS group in the add case, and lag in the remove case, until the device is removed. You can set the logic to make sure it happens.

EKR: Goes over a step by step. RB agrees.

RM: It's ok if there's inconsistency as we go through the process.

**Lego Bricks Slide**

New slide. Maps documents to architecture.

AC: Double checking for the documentation approach.

RB: Content and arch being separate is fine. Shouldn't confuse reading too much, maybe easier for the engineering.

DKG: Imagine EKR banned, how does the client know that we need to eject him?

RB: Great protocol question. We have to work on that exact description.

RM: Architecture should be a separate document.

RR: To DKG, both in the protocol and the DS. A good reason to keep the DS separate is for a super clean interface, and it can be reused in a different context, on purpose, because it handles client to server, not just server to server, which is the scope of MIMI.

EKR: The question is, do we specify the semantic transitions in this group?

RB: Yes, I think we cannot avoid it. Our building blocks have some flexibility, but it's clear to all of us what playbook we are following.

AC: We want to specify, just haven't gotten to it yet.

RR: Only has to be good enough for adoption, not last call.

RM: We need to have one way to do things, but we need to support the actual stuff in the wild.

DKG: We are creating semantics that might eventually be surfaced by the client. How is that going to change?

AC: Konrad's the man, in his deck.

## MIMI Content Format, Rohan Mahy {#mimi-content-format-roan-mahy}

**What's new?**

* Abstract format for attachments, instead of message/external-body
* Added discussion of encrypting external content
* Clarified difference between render and inline
* Created message\_id and timestamp. Inner message encrypted, the external message as an envelope. Given our purview, we cannot determine the inside. Looking for discussion around this.
* Expanded definition on mentions and confusion attacks
* lastSeen field

**Attachments via External Content**

New binary format modeled on message/external-body (RFC 4483); sending client encrypts and uploads.

**External content encryption**

When external content is encrypted

* with an ephemeral

EKR: They can do a man in the middle attack by replacing the body. It breaks the end-to-end integrity of MLS.

KK: Do you specify how we generate the message ID?

DKG: This is the web bug issue? You can easily discover private content if we do this.

RM: What do you want it to do? It depends on what your client wants / needs you to do.

DKG: We kick these issues down the road all the time, finally to the user, who does not know much. We are recapitulating this.

CJ: Quotas are really key. DKG is right, and I hate that we kick things down the road. The transfer of large files will make the RIA shut you down. Either you need to solve the privacy problem or the large file problem, but they cannot be simultaneously solved.

MH: Matrix has wrestled with this.

EKR: Blockchain was a joke. It's a problem, and we still have HTML, so it might be difficult to solve the entire problem. Three people: sender, hub and the receiving provider.
The more you push it away from the sender, the more you walk down the chain of who needs to know about it.

JL: What would happen if you give the option of receiving very large messages?

RM: This creates queuing problems when you are catching up.

RR: You cannot have large messages in a messaging system. The mobile use cases make this a non-starter.

RB: I read Jonathan's proposal differently.

Benjamin: Expresses privacy concerns.

**Sharing messageID and timestamp with providers**

* Content format has messageid and timestamp chosen by encrypting client
* Expose a copy of this in the MLS AAD
* Local or hub ID can reject on duplicate IDs
* Message ID is a UUID

EKR: Why are we doing this?

RM: Getting two messages simultaneously with the same message ID; because we use them in replaces and reply-to fields, I can have a message that refers to it. We need it for replaces.

DKG: Where do we need a non-hash message ID?

RM: Clients do get confused and send the same content, at different times. The timestamp is at encryption time. We do have a problem if they don't know if they sent a message due to transactional problems.

EKR: Put a pin in this, we can solve that. What does the timestamp do?

RM: This is the time the client encrypted the message. We do need a way for sort order.

Jonathan: Is there a reason you aren't saying that we have to use a key committing AAD and the issues just go away?

**Sort Order of Messages**

* Is consistent rendering of sort order

CJ: Yes, of course you have to have a sort order that is consistent.

MH: In Matrix land, we go back and forth. We basically do not do this. Opposite of CJ. We show in order of receipt.

CJ: Everyone needs to see the same thing. The mechanism of ordering is whatever.

EKR: How do you resolve the partial ordering question?

RM: We have no way of getting at stuff below MLS, how do we get it from the client?

DKG: Either we are giving user interface requirements, or we give the causal order and each client renders

## MIMI Design Team Report {#mimi-design-team-report}

Agreed upon previously:

* signalling is crypto agnostic, but we focus on MLS
* As few documents as possible
* Alice and Bob

**Four Documents**

* As described above by RM

**MIMI Delivery Service**

* Goal: specify a relatively generic MLS delivery service
* new Interface section with capabilities
    * Ordering of handshake
    * Membership management
    * Proposal-commit logic
    * MLS specific verification
    * Tracking public group state
    * Assistance for joiners
* Removed specialized add/remove/update operations
    * Now: Propose and Commit operations
    * Simpler interface for use by MIMI protocol

**MIMI Protocol**

* Room level operations
* Signaling based on events
* room state changes based on MLS proposals
* signaling proposals take immediate effect on room state
* Makes use of MIMI DS
* Commits anchor room state with

RB: This protocol has a stack of proposals, the delta between the current state and the future state. These proposals must be committed before client comes online to commit them.

EKR: I'm confused, you said they would first be removed from MLS, then moved from the higher level state.

MH: Room state includes participant room changes, should we be tracking items like name or avatar of the room?

TR: We are focusing on messaging list.

DKG: We need to flesh out the states.

**Events**

* m.room.user: change a participant list
* m.room.info: get room info
* ds.proposal: send MLS proposals
* ds.commit: send MLS commits

**Example Flow**

Alice adds Bob example flow. See presentation.

MH: Events: in Matrix, an event happens in a room. Here, we use it as an RPC mechanism, something that is happening to the room.

KK: Yes, it seems like a query response issue. So, events is what we call it?

AC: Who sets the policy for the room, and when?

KK: We assume that policy is set on creation of the group, for now. Prior to the keyfetch.

RB: We need to tackle issues of consent, not there yet.

RM: Consensus on direction?

AC: With ten readers, premature.

MH: Forgot to give his feeling, and is supportive of the work that was done.

AD: Generally agree that we are in the right direction. Thumbs up.

RR: Design team assigned ad-hoc, to resolve these issues, and to my mind this has happened. We need to keep iterating, it's in the right direction, we have reached consensus.

## Agenda Day 2 {#agenda-day-2}

**Administrivia**

Chairs, 3 minutes

Tim welcomes folks to the second session, shows the Note Well, and encourages local participants to sign in to the tool so that they can be recognized as present and join the queue. Reminder that folks need to use the mics in order to be heard by remote participants. He then reviewed the agenda and asked for any requests to bash the agenda.

Alan Duric joins the mic line and notes that he has been talking to folks about double-ratchet to MLS migration, to highlight best practices. He encourages others interested in that topic to ping him after the session.

The group then turned to the open issues.

**MIMI Protocol Open Issues**

Travis Ralston and Konrad Kohbrok, 30 minutes

This was driven by a live view of the GitHub open issues: https://github.com/bifurcation/ietf-mimi-protocol/issues

The first issue considered was Issue 23 (tracking arbitrary state). https://github.com/bifurcation/ietf-mimi-protocol/issues/23

Travis asked for feedback; Richard Barnes responded. He suggested that there be a generic extension point that different applications could then use (so not completely arbitrary, but a common extension point).

Matthew said that we can't currently predict what features might need to interoperate. We can provide baseline interoperability; it is table stakes to be able to decorate these with other data as required.

Raphael noted that he felt the design should be flexible with regard to encryption; keeping the door open is a good thing.

DKG said we should be able to handle encrypted shared state for the room. He feels that Daniel's point was true, but that there were interactions between the encryption of the names used to index the values (names like "icon" would have known meanings, but arbitrary ones would not).

Konrad notes that the MLS mechanism allows you to get agreement on some things, but it does change from epoch to epoch, and if the encryption of those arbitrary elements is outside of MLS, we have a new key management problem.

Rohan Mahy notes that we have much room state in policy; it could be attached as is done for the participant list. But he would like to have an existence proof that we can do it adjacent to, in, or separate from the policy document. He suggests we punt on this until we get the mandatory functionality working. He would like to get this done (so, revisit), but not at the cost of getting the blockers done.

Travis notes that this would be for after adoption, not part of the adoption call.

Jonathan Lennox asked if the same mechanism for associated files applies here.

Benjamin notes we have the recurrent need for storage in MLS, and he wonders if we want a MIMI-specific message or a generic MLS group storage/state mechanism. He thinks something should be possible here, but the more generic would be better.

Matthew returned to the mic to discuss whether or not there are domain specific needs here, but he is thinking now that we want to design this as generic building blocks.

Richard then returned to the mic to ask whether the general stuff in the room is expected to be confidential from the providers; if not, the encryption might not be needed.

Jonathan Hoyland noted that designing encryption on the fly is probably not great, but he wondered if token-based re-encryption was being considered here. Travis noted that he was expecting something MLS-based and specific (noting that room names and similar would be needed in the MLS context).
Jonathan notes that the issue might be that the clients might not have the capability to decrypt and re-encrypt after every epoch, but a lightweight token-based approach here might work to ease that burden.

Rohan asked again why this is the focus, given the 18 open issues needed to solve the issues already in our way. Travis asked which we should turn to. Rohan read out a list of names (not captured). The chair asked Travis to name the next one after the final speaker on this topic had spoken.

Raphael agreed that this might not be the most critical issue, but discussed the encryption issue here by suggesting we start with encrypting everything and then move on to identify what needs to move into the clear. Experience there would tell us both that set and whether proxy re-encryption is necessary.

The group then turned to Issue 33. https://github.com/bifurcation/ietf-mimi-protocol/issues/33

Rohan at the mic generally agreed that if you need to get a key package to someone, getting it through the hub is a good way to do that, but he does not want to exclude doing it other ways, e.g. in response to a knock. So this is sensible guidance, but not exclusive.

Matthew noted that this is something Matrix got wrong. There they could end up in situations where you could get messages without having had a connection to the source for the key. So you'd have a message that was unreadable. This is needed functionality, though he agrees with Rohan that it need not be exclusive.

Travis: do we have any opportunity for the parties to communicate directly among themselves (referencing a legal agreement point)? The chair says that we should focus on building a generic interoperable system rather than on the legal impediments which might arise.

DKG asks why working through the hub requires additional authentication. Konrad replies that this is more for upload in a situation where MLS is not in use. If it is MLS, this is not needed as the hub just forwards.
Konrad replies that this is true, but it has to be within a group. DKG asked if the issue could be clarified. Konrad agreed to do so and provide a possible solution.

The chair concluded this portion of the review, but asked for clarification on the working methods from the design team. There will be some changes based on the discussion at this meeting plus any review that occurs post-meeting. The working group is expected to review the issues and/or enter new issues which it sees as blockers. The request for adoption may occur after another quick revision or this version, depending on the changes identified.

**User discovery**

**Consensus points thus far** (chairs, 10 minutes)

The chair moved on to framing the user discovery problem (slide 5, chair slides). Last time we came to an agreement that there could be a large number of messaging services and that clients wish to discover which service-specific identifiers map to a service-independent identifier. We also agree that there is not a full mesh of trusted relationships among the messaging providers (there may be malicious assertions). There are, as a result, two distinct issues: authentication of the mappings and distribution of the mappings. This in turn implies that there might be both first party and third party mapping providers, so the protocol needs to support both models. Clients also need an efficient way to query mappings. It is an open question whether there are any options other than the messaging providers serving as distribution points.

Richard: thanks for that summary. An important point to keep in mind is that this division of labor means that one provider asserts the validity, while there is a distinct role for distributing the mappings. The chair then reminded folks reviewing the proposals to keep these distinct roles in mind.

Jon Peterson asked whether a document is planned for which consensus can be called.
He likes this, but wonders whether it will be noted separately or elsewhere. After some discussion, it was agreed that it would go into the requirements document being written by Jonathan Rosenberg.

Giles Hogben and Femi Olumofin, 35 minutes

https://datatracker.ietf.org/doc/draft-party-mimi-user-private-discovery/

Giles started by noting that this presentation is focused mostly on the distribution problem, as the other problem seems similar enough to a PKI shape. The harder problem appears to be distribution, so that is their focus. He then presented the discovery problem statement (see slides above). He then walked through the diagram of threat actors on slide 3.

Giles then reviewed the privacy requirements: from his perspective the most important is that a discovery service provider should not learn the SII a user is querying unless they are sending to or receiving from the user being queried. He then looked at requirements for the sender platform, receiving platform, non-recipient platforms, and third party services (see slide 5). One issue is that clients commonly look at contacts prior to a message being sent, which discloses the social graph connections to parties to whom no messages have been sent (and who may never receive a message, since contact lists and message targets may be different).

Ben Schwartz asked if the SSI for a given SII is public. Giles: not exactly, but more on that later. It is public to an individual querier, but that catalog of subscribers is not public. Ben says that is a pretty weak privacy goal. It's particularly problematic for small messaging services, which might be revelatory. Giles: we did have a lot of discussion of this, and we decided it was common enough to assume. Ben disagreed with that characterization. Giles then said if you don't want to be discoverable, you have to avoid disclosing the identifier you don't want discoverable.

Jon Peterson then reviewed how the SSI and SII mappings work in systems like SMS or MMS, which use enumerable identifiers. Jon notes that the kinds of privacy properties provided by the system on the slide don't align with the privacy issues implied by the framing Alissa supplied.

DKG noted that the slide also didn't reflect his understanding of the issue of mapping disclosure; the social graph being exposed by the queries is a core issue. Giles said that this is not the issue being discussed here, but he agrees that it is among the needed privacy properties. He believes that the mapping problem is a better known problem, so he has focused elsewhere.

Alissa noted that the other issue had been discussed in previous interim discussions, and that at the moment it is best to simply be clear about which part of the problem is being addressed.

Giles then showed a slide listing a requirement that the queried SII must generally be hidden from those not making queries. He then presented the slide on "non-requirements". Hiding SII<>service mapping was listed as a non-goal as it is "public but not publicized", attracting a queue. He said we should support scraping prevention, but it is not feasible in his eyes to make this entirely private.

Jonathan Rosenberg notes that this is not a boolean. He believes we absolutely require a solution to enumeration; that's one dimension. There are also requirements like rate limiting. He doesn't want to support more queries from a service than would be indicated by the user base of a service.

Jon Peterson then asks a question about how the mapping would work from a telephone number to an iMessage/iCloud identifier. Giles says that he doesn't believe that this can be prevented. Jon Peterson says that if we start from the assumption that all of this information is public, it is a non-starter.

Mallory says that we are trying to create an interoperable system and that some of the most privacy-focused systems will require much more than an SII.
Giles asks whether she means the SSI is required? She'll think more about it.

Rohan notes that Giles asserted that you can get a handful of these at a time now; there is a leakage, but it is not enough to create a full map. EKR said that it is trivial to generate enough queries to get past that. Rohan said that he's not sure who is right there, but he is sure that this level of leakiness is the level that we want. If you do, then we need a crisp statement of what the characteristics are. Does that seem reasonable? Giles is not opposed to trying to figure out how to solve the problem, but it will be hard to achieve.

Alissa then raised the point that what we were talking about before was service reachability (a phone number is reachable), but that this is different from other discoverability controls. A provider that controls reachability, or an individual that controls service reachability, impacts that.

Jonathan asks what the difference is between service reachability ("does this telephone number have an account on WhatsApp") and the account information associated with that account. Whether the identifier is shared or not is this. He then notes that there is a fundamental tension between global reachability among messaging systems and the need to protect the privacy of the data relevant to the SII. Giles notes that "we have to hide it from everyone" may be unachievable.

Jon Peterson really wanted to go back to Alissa's point and how the transformation works in real-time for MMS.

Ted Hardie then asked whether we had agreement about whether disclosure was a single party or a two-party requirement.

DKG then pointed out that there was an additional wrinkle here in that there could be more than one identifier associated with an SII, and that the mapping between the different identifiers associated with an SII was also sensitive. Giles agreed that this was a key point.

Vittorio Bertola, 15 minutes

https://datatracker.ietf.org/doc/draft-bertola-mimi-discovery-dns/

Vittorio then presented his slides. He notes that these discoveries are of accounts, not users (since a single user might have multiple accounts). He then described a use case in which the user wants to be discovered (see slide 4 1) and so provides a MIMI-specific identifier, and then contrasted that with the case where the user provides a mappable external identifier and the case where a party has an external identifier and wishes to test the availability of a map. He then presented a set of user identifier requirements (see slide 5).

Jon Peterson asks if this is predicated on the idea that we are creating a new class of identifiers. Vittorio responds that this is optional, not required.

Vittorio then provided technical requirements and finally the non-technical requirements for the solutions. He then asks why the DNS would not be useful here. This might work in a hostname-like format, but this would not eliminate the need for oracles for external identifiers.

Jon Peterson notes that non-public ENUM is being used today for telephone number routing and that if we did need this, we would not need to create new RRs.

Vittorio presents a slide showing a possible query for a discovery record (of whatever type). He then shows a privacy-friendly DNS-based discovery. The final slide asks whether we want to at least consider the possibility of new, MIMI-specific identifiers.

Kaliya Young (identitywoman is her handle) notes that there are many different streams of work associated with identity going on, and she asks how they are connected to this work. She is wondering if SPICE folks might have insights that would help here. Vittorio responds that there is a possibility.

Jon asks whether we are talking about identifiers that are going to be used by humans, rather than the transformations between those.
If those, then his answer is no, we don't need a 57th here to join the club, but we may need transformations to be defined.

Alan ends the meeting by thanking the presenters and everyone who has put in time, but also with a plea for early submission of drafts so that the conversations can have a more substantial basis.