# MIMI Prague   {#mimi-prague}

## Richard Barnes - Architecture Overview   {#richard-barnes---archicture-overview}

**Explanation of CSSC**

*   Each room is hosted at a hub server
*   All comms go through the hub
*   Not every server wants to talk to others

**Terminology**

*   Rooms are the unit of measurement:
    *   They have a state, whose components are
        *   auth policy,
        *   users (participants),
        *   MLS

*   Participants can be active/inactive

**Preemption**

*   A user may be a participant only if authed
*   The client can only be a member of the MLS group if its user is a participant
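
The two rules above can be sketched as a pair of checks. This is a minimal illustration, not the MIMI specification: the `Room` class, its field names, and modelling the auth policy as a plain set of allowed user IDs are all assumptions made for the example, as is reading "authed" as "authorized by the room's policy".

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Hypothetical room state: auth policy, participant list, MLS members."""
    allowed_users: set = field(default_factory=set)   # stand-in for the auth policy
    participants: set = field(default_factory=set)    # user IDs
    mls_members: dict = field(default_factory=dict)   # client ID -> owning user ID

    def add_participant(self, user: str) -> None:
        # Rule 1: a user may be a participant only if authorized.
        if user not in self.allowed_users:
            raise PermissionError(f"{user} is not authorized for this room")
        self.participants.add(user)

    def add_client(self, client: str, user: str) -> None:
        # Rule 2: a client may join the MLS group only if its user participates.
        if user not in self.participants:
            raise PermissionError(f"{user} is not a participant")
        self.mls_members[client] = user

    def remove_participant(self, user: str) -> None:
        # Removing a user also evicts that user's clients, so rule 2 keeps holding.
        self.participants.discard(user)
        self.mls_members = {c: u for c, u in self.mls_members.items() if u != user}
```

The sketch removes a user's clients in the same step as the user. In the discussion recorded below, the MLS group may lag behind the participant list in the remove case until a commit lands, so a real implementation would have to tolerate that intermediate state.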

**Confirmation**  
Different aspects of the state are managed via control sub protocols.
Each commit reflects the current room state.  
All clients have a common understanding of the room state.

EKR: Same issue as last time. Imagine the ban use case.  
RB: The servers run a little bit ahead of the MLS group in the add case,
lags in the remove case, until the device is removed. You can set the
logic to make sure it happens.  
EKR: Goes over a step by step. RB agrees.   
RM: It's ok if there's inconsistency as we go through the process.

**Lego Bricks Slide**  
New slide.   
Maps documents to architecture

AC: Double checking for the documentation approach.   
RB: Content and arch being separate is fine. Shouldn't confuse reading
too much, maybe easier for the engineering.  
DKG: Imagine EKR banned, how does the client know that we need to eject
him.   
RB: Great protocol question. We have to work on that exact description. 

RM: Architecture should be a separate document.   
RR: To DKG, both in the protocol and DS  
A good reason to keep the DS separate for a super clean interface, and
can be reused in a different context, on purpose because it handles
client to server, not just server to server, which is the scope of MIMI.

EKR: The question is that do we specify the semantic transitions in this
group?  
RB: Yes, I think we cannot avoid it. Our building blocks have some
flexibility, but it's clear to all of us what playbook we are following.

AC: We want to specify, just haven't gotten it yet.   
RR: Only has to be good enough for adoption, not last call.  
RM: We need to have one way to do things, but we need to support the
actual stuff in the wild.   
DKG: We are creating semantics that might eventually be surfaced by the
client. How is that going to change?  
AC: Konrad's the man, in his deck.

## MIMI Content Format, Rohan Mahy   {#mimi-content-format-roan-mahy}

**What's new?**

*   Abstract format for attachments, instead of message/external-body
*   Added discussion of encrypting external content
*   clarified difference between render and inline
*   created message\_id and timestamp. Inner message encrypted, the
    external message as an envelope. Given our purview, we cannot
    determine the inside. Looking for discussion around this.
*   Expanded definition on mentions and confusion attacks
*   lastSeen field

Attachments via External Content  
New binary format modeled on message/external-body RFC 4483  
sending client encrypts and uploads  
External content encryption  
When external content is encrypted

*   with an ephemeral

EKR: They can do a man in the middle attack by replacing the body. It
breaks the end-to-end integrity of MLS  
KK: Do you specify how we generate the message ID?  
DKG: This is the webbug issue? You can easily discover privacy content
if we do this.  
RM: What do you want it to do? It depends on what your client wants /
needs you to do.  
DKG: We kick these issues all the time, finally to the user who knows
not much. We are recapitulating this.  
CJ: Quotas are really key. DKG is right, and I hate that we kick things
down the road. The transfer of large files will make the RIA shut you
down. Either you need to solve the privacy problem or the large file
problem, but they cannot be simultaneously solved.  
MH: Matrix has wrestled with this.   
EKR: Blockchain was a joke. It's a problem, and we still have HTML, so
might be difficult to solve the entire problem. Three people: sender,
hub and the receiving provider. The more you push it away from sender,
the more you walk down the chain of who needs to know about it.  
JL: What would happen if you give the option of receiving very large
messages?  
RM: This creates queuing problems when you are catching up.  
RR: You cannot have large messages in a messaging system. The mobile use
cases make this a non-starter  
RB: I read Jonathan's proposal differently.  
Benjamin: Expresses privacy concerns

**Sharing messageID and timestamp with providers**

*   Content format has messageid and timestamp chosen by encrypting
    client
*   Expose a copy of this in the MLS AAD
*   Local or hub ID can reject on duplicate IDs
*   Message ID is a UUID
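
A minimal sketch of the duplicate-ID check described above, assuming the hub can read only the copy of the message ID exposed in the MLS AAD. The `Hub` class and its `accept` method are hypothetical names for illustration, and an in-memory set stands in for whatever storage a provider would actually use.

```python
import uuid


class Hub:
    """Hypothetical hub that sees only the message ID copied into the MLS AAD."""

    def __init__(self) -> None:
        self.seen_ids = set()

    def accept(self, message_id: uuid.UUID) -> bool:
        # Reject a message whose ID was already used in this room.
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        return True


hub = Hub()
first = uuid.uuid4()          # chosen by the encrypting client
print(hub.accept(first))      # True: new ID
print(hub.accept(first))      # False: duplicate ID is rejected
```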

EKR: Why are we doing this?  
RM: Getting two messages simultaneously with the same message ID, because
we use them in replaces and reply-to fields, I can have a message that
refers to it. We need it for replaces.   
DKG: Where do we need a non-hash message ID?  
RM: Clients do get confused and send the same content, at different
times. The timestamp is at encryption time. We do have a problem if they
don't know if they sent a message due to transactional problems.   
EKR: Put a pin in this, we can solve that. What does the timestamp do?  
RM: This is the time the client encrypted the message. We do need a way
for sort order  
Jonathan: Is there a reason you aren't saying that we have to use a key
committing AAD and the issues just go away?

**Sort Order of Messages**

*   Is consistent rendering of sort order required?

CJ: Yes, of course you have to have a sort order that is consistent  
MH: In matrix land, we go back and forth. We basically do not do this.
Opposite of CJ. We show in order of receipt.  
CJ: Everyone needs to see the same thing. The mechanism of ordering is
whatever.  
EKR: How do you resolve the partial ordering question?  
RM: We have no way of getting at stuff below MLS, how do we get it from
the client?  
DKG: Either we are giving user interface requirements, or we give the
causal order and each client renders
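
To make the "everyone sees the same thing" option concrete, here is one possible sketch: sort by the sender-chosen timestamp and break ties with the message ID. The `Message` fields and the tie-break rule are assumptions for illustration; the group did not settle on a mechanism, and sender timestamps can be wrong, so this is not a recommendation over order-of-receipt or causal ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    message_id: str   # hypothetical: UUID string chosen by the sender
    timestamp: int    # hypothetical: sender's encryption time, ms since epoch
    text: str


def consistent_order(received):
    # Sort by sender timestamp, breaking ties by message ID, so every client
    # that holds the same set of messages renders the same sequence.
    return sorted(received, key=lambda m: (m.timestamp, m.message_id))


a = Message("a1", 1000, "hello")
b = Message("b2", 1000, "hi")
c = Message("c3", 900, "first")

# Two clients receive the same messages in different orders.
client_1 = [a, b, c]
client_2 = [c, b, a]
assert consistent_order(client_1) == consistent_order(client_2)
```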

## MIMI Design Team Report   {#mimi-design-team-report}

Agreed upon previously:

*   signalling is crypto agnostic, but we focus on MLS
*   As few documents as possible
*   Alice and Bob

Four Documents

*   As described above by RM

**MIMI Delivery Service**

*   Goal: specify a relatively generic MLS delivery service
*   new Interface section with capabilities
    *   Ordering of handshake
    *   Membership management
    *   Proposal-commit logic
    *   MLS specific verification
    *   Tracking public group state
    *   Assistance for joiners

*   Removed specialized add/remove/update operations
*   Now: Propose and Commit operations
*   Simpler interface for use by MIMI protocol

**MIMI Protocol**

*   Room level operations
    *   Signaling based on events
    *   room state changes based on MLS proposals
    *   signaling proposals take immediate effect on room state

*   Makes use of MIMI DS
    *   Commits anchor room state with

RB: This protocol has a stack of proposals, the delta between the
current state and the future state. These proposals must be committed
before client comes online to commit them.   
EKR: I'm confused, you said they would first be removed from MLS, then
moved from the higher level state  
MH: Room state includes participant room changes, should we be tracking
items like name or avatar of the room.   
TR: We are focusing on messaging list.   
DKG: We need to flesh out the states.

**Events**

*   m.room.user: change a participant list
*   m.room.info: get room info
*   ds.proposal: send MLS proposals
*   ds.commit: send MLS commits
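
The four event types above could be routed with a simple dispatch table, sketched below. Only the event names come from the slide; the handler functions, payload shapes, the `room` record, and the rule that a commit clears the pending proposals are all assumptions for illustration.

```python
from typing import Callable, Dict

# Hypothetical room record; the field names are illustrative only.
room = {"participants": [], "info": {"name": "example"}, "proposals": [], "commits": []}


def on_room_user(payload):
    # Change the participant list.
    room["participants"] = list(payload["participants"])
    return room["participants"]


def on_room_info(payload):
    # Return room info; the payload is unused in this sketch.
    return room["info"]


def on_ds_proposal(payload):
    # Queue an opaque MLS proposal until a commit arrives.
    room["proposals"].append(payload["proposal"])
    return len(room["proposals"])


def on_ds_commit(payload):
    # Assumption: a commit consumes all pending proposals.
    room["commits"].append(payload["commit"])
    room["proposals"].clear()
    return len(room["commits"])


HANDLERS: Dict[str, Callable] = {
    "m.room.user": on_room_user,
    "m.room.info": on_room_info,
    "ds.proposal": on_ds_proposal,
    "ds.commit": on_ds_commit,
}


def handle(event_type: str, payload: dict):
    if event_type not in HANDLERS:
        raise ValueError(f"unknown event type: {event_type}")
    return HANDLERS[event_type](payload)
```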

**Example Flow**  
Alice adds Bob example flow. See presentation

MH: Events: in matrix, happens in a room. Here, we use it as an RPC
mechanism, something that is happening to the room.   
KK: Yes, it seems like a query response issue. So, events is what we call
it?  
AC: Who and when sets the policy for the room?  
KK: We assume that policy is set on creation of the group, for now.
Prior to the keyfetch.  
RB: We need to tackle issues of consent, not there yet.  
RM: Consensus on direction?  
AC: With ten readers, premature.  
MH: Forgot to give his feeling, and is supportive of the work that was
done.  
AD: Generally agree that we are in the right direction. Thumbs up.  
RR: Design team assigned ad-hoc, to resolve these issues, and to my mind
this has happened. We need to keep iterating, it's in the right
direction, we have reached consensus.

## Agenda Day 2   {#agenda-day-2}

**Administrivia** Chairs, 3 minutes

Tim welcomes folks to the second session and shows the Note Well and
encourages local participants to sign in to the tool so that they can be
recognized as present and join the queue. Reminder that folks need to
use the mics in order to be heard by remote participants. He then
reviewed the agenda and asked for any requests to bash the agenda. Alan
Duric joins the mic line and notes that he has been talking to folks
about double-ratchet to MLS migration, to highlight best practices. He
encourages others interested in that topic to ping him after the
session.

The group then turned to the open issues.

**MIMI Protocol Open Issues**  
Travis Ralston and Konrad Kohbrok, 30 minutes

This was driven by a live view of the github open issues:  
https://github.com/bifurcation/ietf-mimi-protocol/issues

The first issue considered was Issue 23 (tracking arbitrary state).
https://github.com/bifurcation/ietf-mimi-protocol/issues/23 Travis asked
for feedback; Richard Barnes responded. He suggested that there be a
generic extension point that different applications could then use (so
not completely arbitrary, but a common extension point). Matthew said
that we can't currently predict what features might need to
interoperate. We can provide baseline interoperability; it is table
stakes to be able to decorate these with other data as required. Raphael
noted that he felt the design should be flexible in regard to
encryption; keeping the door open is a good thing. DKG said we should be
able to handle encrypted shared state for the room. He feels that Daniel's
point was true, but that there were interactions between the encryption
of the names used to index the values (names like "icon" would
have known meanings, but arbitrary ones would not.) Konrad notes that
the MLS mechanism allows you to get agreement on some things, but it
does change from epoch to epoch, and if the encryption of those
arbitrary elements is outside of MLS, we have a new key management
problem. Rohan Mahy notes that we have much room state in policy; it
could be attached as is done for participant list. But he would like to
have an existence proof that we can do it adjacent to, in, or separate
from the policy document. He suggests we punt on this until we get the
mandatory functionality working. He would like to get this done (so,
revisit), but not at the cost of getting the blockers done. Travis notes
that this would be for after adoption, not part of the adoption call.
Jonathan Lennox asked if the same mechanism for associated files applies
here. Benjamin notes we have the recurrent need for storage in MLS, and
he wonders if we want a MIMI-specific message or a generic MLS group
storage/state mechanism. He thinks something should be possible here,
but the more generic would be better. Matthew returned to the mic to
discuss whether or not there are domain specific needs here, but he is
thinking now that we want to design this as generic building blocks.
Richard then returned to the mic to ask whether the general stuff in the
room is expected to be confidential from the providers; if not, the
encryption might not be needed. Jonathan Hoyland noted that designing
encryption on the fly is probably not great, but he wondered if token-based
re-encryption was being considered here? Travis noted that he was
expecting something MLS-based and specific (noting that room names and
similar would be needed in the MLS context). Jonathan notes that the
issue might be that the clients might not have the capability to decrypt
and re-encrypt after every epoch, but a lightweight token-based approach
here might work to ease that burden. Rohan asked again why this is the focus, given
the 18 open issues needed to solve the issues already in our way. Travis
asked which we should turn to. Rohan read out a list of names (not
captured). The chair asked Travis to name the next one after the final
speaker on this topic had spoken. Raphael agreed that this might not be
the most critical issue, but discussed the encryption issue here by
suggesting we start with encrypting everything and then move on to
identify what needs to move into the clear. Experience there would tell
us both that set and whether proxy re-encryption is necessary.

The group then turned to Issue 33.
https://github.com/bifurcation/ietf-mimi-protocol/issues/33. Rohan at
the mic generally agreed that if you need to get a key package to
someone that getting it through the hub is a good way to do that, but he
does not want to exclude doing it other ways, e.g. in response to a
knock. So this is sensible guidance, but not exclusive. Matthew noted
that this is something Matrix got wrong. There they could end up in
situations where you could get messages without having had a connection
to the source for the key. So you'd have a message that was unreadable.
This is needed functionality, though he agrees with Rohan that it need
not be exclusive. Travis: do we have any opportunity for the parties to
communicate directly among themselves (referencing a legal agreement
point). Chair says that we should focus on building a generic
interoperable system rather than focus on the legal impediments which
might arise. DKG asks why working through the hub requires additional
authentication. Konrad replies that this is more for upload in a
situation where MLS is not in use. If it is MLS, this is not needed as
the hub just forwards. Konrad replies that this is true, but it has to
be within a group. DKG asked if the issue could be clarified. Konrad
agreed to do so and provide a possible solution.

The chair concluded this portion of the review, but asked for
clarification on the working methods from the design team. There will be
some changes based on the discussion at this meeting plus any review
that occurs post-meeting. The working group is expected to review the
issues and/or enter new issues which it sees as blockers. The request
for adoption may occur after another quick revision or this version,
depending on the changes identified.  
**User discovery**  
**Consensus points thus far** (chairs, 10 minutes)  
The chair moved on to framing the user discovery problem (slide 5, chair
slides). Last time we came to an agreement that there could be a large
number of messaging services and that clients wish to discover which
service specific identifiers map to a service independent identifier. We
also agree that there is not a full mesh of trusted relationships among
the messaging providers (there may be malicious assertions). There are,
as a result, two distinct issues: Authentication of the mappings and
distribution of the mappings. This in turn implies that there might be
both first party and third party mapping providers, so the protocol
needs to support both models. Clients also need an efficient way to
query mappings. It is an open question whether there are any other
options than the messaging providers serving as distribution points.

Richard: thanks for that summary. An important point to keep in mind is
that this division of labor means that one provider asserts the validity
where there is a distinct role for distributing the mappings.

The chair then reminded folks reviewing the proposals to keep these
distinct roles in mind.

Jon Peterson asked if there is a document planned that will be produced
and for which consensus can be called? He likes this, but wonders
whether it will be noted separately or elsewhere? After some discussion,
it was agreed that it would go into the requirements documents being
written by Jonathan Rosenberg.

Giles Hogben and Femi Olumofin, 35 minutes  
https://datatracker.ietf.org/doc/draft-party-mimi-user-private-discovery/

Giles started by noting that this presentation is focused mostly on the
distribution problem, as the other problem seems similar enough to a
PKI shape. The harder problem appears to be distribution, so that is
their focus. He then presented the discovery problem statement (see
slides above). He then walked through the diagram of threat actors on
slide 3. Giles then reviewed the privacy requirements: from his
perspective the most important is that a discovery service provider
should not learn the SII a user is querying unless they are sending or
receiving to the user being queried. He then looked at requirements for
sender platform, receiving platform, non recipient platforms, and third
party services (see slide 5). One issue is that clients commonly look at
contacts prior to a message being sent, which discloses the social graph
connections to parties to whom no messages have been sent (and who may
never receive a message, since contact lists and message targets may be
different).

Ben Schwartz asked if the SSI for a given SII is public? Giles--not
exactly, but more on that later. It is public to an individual querier,
but that catalog of subscribers is not public. Ben says that is a pretty
weak privacy goal. It's particularly problematic for small messaging
services which might be revelatory. Giles said they did have a lot of
discussion of this, and they decided it was common enough to assume. Ben
disagreed with that characterization. Giles then said if you don't want
to be discoverable you have to avoid disclosing the identifier you don't
want discoverable. Jon Peterson then reviewed how the SSI and SII
mappings work in systems like SMS or MMS, which use enumerable
identifiers. Jon notes that the kinds of privacy properties provided by
the system on the slide don't align with the privacy issues implied by
the framing Alissa supplied.

DKG noted that the slide also didn't reflect his understanding of the
issue of mapping disclosure; the social graph being exposed by the
queries is a core issue. Giles said that this is not the issue being
discussed here, but he agrees that that is among the needed privacy
properties. He believes that the mapping problem is a better known
problem, so he has focused elsewhere. Alissa noted that the other issue
had been discussed in previous interim discussions, and that at the
moment it is best to simply be clear about which part of the problem is
being addressed. Giles then showed a slide listing a requirement that the
queried SII must generally be hidden from those not making queries. He
then presented the slide on "non-requirements". Hiding SII<>service
mapping was listed as a non-goal as it is "public but not publicized",
attracting a queue. He said we should support scraping prevention, but
it is not feasible in his eyes to make this entirely private.

Jonathan Rosenberg notes that this is not a boolean. He believes we
absolutely require a solution to enumeration; that's one dimension.
There are also requirements like rate limiting. He doesn't want to
support more queries from a service than would be indicated by the user
base of a service. Jon Peterson then asks a question about how the
mapping would work from a telephone number to imessage/icloud
identifier. Giles says that he doesn't believe that this can be
prevented. Jon Peterson says that if we start from the assumption that
all of this information is public, it is a non-starter. Mallory says
that we are trying to create an interoperable system and that some of
the most privacy-focused systems will require much more than an SII.
Giles asks whether she means the SSI is required? She'll think more about
it. Rohan notes that Giles asserted that you can get a handful of these
at a time now; there is a leakage, but it is not enough to create a full
map. EKR said that it is trivial to generate enough queries to get past
that. Rohan said that he's not sure who is right there, but he is sure
that this level of leakiness is the level that we want. If you do, then
we need a crisp statement of what the characteristics are. Does that
seem reasonable? Giles is not opposed to trying to figure how to solve
the problem, but it will be hard to achieve. Alissa then raised the
point that what we were talking about before was service reachability--a
phone number is reachable, but that this is different from other
discoverability control. A provider that controls reachability or an
individual that controls service reachability impacts that.  
Jonathan asks whether the difference is that service reachability is
"does this telephone number have an account on Whatsapp" vs. the account
information associated with that account. Whether the identifier is
shared or not is this. He then notes that there is a fundamental tension
between global reachability among messaging systems and the need to
protect the privacy of the data relevant to the SII. Giles notes that
"we have to hide it from everyone" may be unachievable. Jon Peterson
really wanted to go back to Alissa's point and how the transformation
works in real-time for MMS. Ted Hardie then asked about whether we had
agreement about whether disclosure was a single party or a two-party
requirement. DKG then pointed out that there was an additional wrinkle
here in that there could be more than one identifier associated with an
SII, and that the mapping between the different identifiers associated
with SII was also sensitive. Giles agreed that this was a key point.
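
One way to read Jonathan Rosenberg's point about not supporting more queries from a service than its user base would indicate is a per-provider query budget. The sketch below is only an illustration of that idea: the `QueryBudget` class, the per-user quota, and the fixed window are all assumptions, and nothing like this was agreed in the session.

```python
class QueryBudget:
    """Hypothetical per-provider budget for discovery queries in one time window."""

    def __init__(self, user_count: int, queries_per_user: int) -> None:
        # Assumed policy: a provider's budget scales with its user base.
        self.remaining = user_count * queries_per_user

    def allow(self) -> bool:
        # Spend one unit of budget per query; refuse once it is exhausted.
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True


small_provider = QueryBudget(user_count=2, queries_per_user=3)
results = [small_provider.allow() for _ in range(7)]
print(results.count(True))   # 6: the seventh query exceeds the budget
```

A budget like this limits bulk enumeration but does not address EKR's objection that a determined party can spread queries out, so it is at most one dimension of the requirement.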

Vittorio Bertola, 15 minutes  
https://datatracker.ietf.org/doc/draft-bertola-mimi-discovery-dns/

Vittorio then presented his slides. He notes that it is important to
note that these discoveries are of accounts, not users (since a single
user might have multiple accounts). He then described a use case in
which the user wants to be discovered (see slide 4 1) and so provides a
MIMI-specific identifier, and then contrasted that with the case where
the user provides a mappable external identifier and the case where a
party has an external identifier and wishes to test the availability of
a map. He then presented a set of user identifier requirements (see slide
5). Jon Peterson asks if this is predicated on the idea that we are
creating a new class of identifiers. Vittorio responds that this is
optional, not required. Vittorio then provided technical requirements
and then finally the non-technical requirement for the solutions. He
then asks why the DNS would not be useful here. This might work in a
hostname-like format, but this would not eliminate the need for oracles
for external identifiers. Jon Peterson notes that non-public ENUM is
being used today for telephone number routing and that if we did need
this, we would not need to create new RRs. Vittorio presents a slide
that there is a possible query for a discovery record (of whatever
type). He then shows a privacy-friendly DNS-based discovery. The final
slide asks whether we want to at least consider the possibility of new,
MIMI-specific identifiers. Kaliya Young (identitywoman is her
handle) notes that there are many different streams of work associated
with identity going on, and she asks how this is connected to this work.
She is wondering if SPICE folks might have insights that would help
here. Vittorio responds that there is a possibility. Jon asks the
question if we are talking about identifiers that are going to be used by
humans, rather than the transformations between those? If those, then
his answer is no, we don't need a 57th here to join the club, but we may
need transformations to be defined.
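
As background for the ENUM reference above, the sketch below shows the standard ENUM transformation from a telephone number to a DNS name: keep the digits, reverse them, join with dots, and append a suffix. The function name is hypothetical, and the `suffix` parameter is an assumption added because non-public ENUM deployments use private roots rather than `e164.arpa`. It illustrates the kind of transformation Jon Peterson refers to and is not a mechanism adopted for MIMI.

```python
def enum_domain(e164_number: str, suffix: str = "e164.arpa") -> str:
    # Keep digits only, reverse them, and separate them with dots.
    digits = [ch for ch in e164_number if ch.isdigit()]
    if not digits:
        raise ValueError("no digits in number")
    return ".".join(reversed(digits)) + "." + suffix


print(enum_domain("+1 415 555 0100"))
# 0.0.1.0.5.5.5.5.1.4.1.e164.arpa
```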

Alan ends the meeting by thanking the presenters and everyone who has
put in time, but also with a plea for early submission of drafts so that
the conversations can have a more substantial basis.