Minutes interim-2023-moq-05: Wed 17:00
minutes-interim-2023-moq-05-202302011700-00
Meeting Minutes | Media Over QUIC (moq) WG | |
---|---|---|
Date and time | 2023-02-01 17:00 | |
Title | Minutes interim-2023-moq-05: Wed 17:00 | |
State | Active | |
Last updated | 2023-02-23 |
IETF MoQ Working Group Interim 2023 Jan 31 - Feb 1
Day 2, Morning Session
9:30 - 11:00: Agenda Bash and Object model:
From the chairs: Among the open questions:
Do we need a protocol-level model for the composition of emissions that is consumed by an application (e.g. a videoconference with multiple publishers or a composed application of source + translation overlay)?
How long-lived or unique do identifiers need to be for a composition of emissions (e.g. Broadcast with both live commentary and closed captions or a Zoom meeting with multiple participants)? How about for a single resource within the composition (the closed captions, or the resources sent by a single member of a video conference)?
How do you re-establish a session?
Where does media format negotiation occur and when in the protocol flow?
How does priority get signaled to end consumers? To intermediate network elements?
- Relax Catalog
- Is catalog/moq generic enough
- [ali]: we are using terminology that differs from CMAF, which is difficult for people working in both organisations
- [krill]: representation feels like a textual metadata of a metadata, i.e., we should not need another object for this
- [lennox]: how do you explain the case where you have several cameras of the same scene (left, center, right), and the viewer, depending on bandwidth, can subscribe to the appropriate ones?
- [suhas]: there is ambiguity with the word "track", which the representation is trying to resolve
- [roberto]: what questions are these answering? object is bits, track/group, what are they? Kinda understand what broadcast and representation are trying to answer.
- [will]: I helped Suhas write this PR, the hierarchy is Media Session, Media Stream, Media Object.
-
Media Groups and Objects
- [suhas/will]: explaining the concept from slides related to PR
- WARP: GOP per QUIC stream. Each gop is a group
- RUSH: One video frame per QUIC Stream
- VR/AR want to group a spatial scene as an object
- A group tries to provide a sync point, i.e., the application uses the group as a way to understand how to handle objects.
- in the case of RUSH, the group will be 1:1 object
- in the case of WARP, the group will have multiple objects.
- [alan]: is there a track for one broadcast or multiple?
- [luke]: group seems to be a group of pictures. in the case of warp, the QUIC stream describes a GOP, so the viewer can ask for the whole stream or start receiving at the beginning of the next stream
- [will]: it is a quick way to identify gop boundaries
- [luke]: if you are sending a frame per quic stream, you will need a way to identify the dependency graph. the group solves that problem, but you can do that with QUIC streams.
- [roberto]: group is good because it can also be applied to audio where audio packets also depend on past packets.
- [mo]: group makes sense as dependency within a picture, so we need to define a group to be either an internal dependency or external dependency.
- [jonathan]: groups are actually dependencies, i.e., if you have temporal scalability, you do not really need the full QUIC stream to playback, as only some frames are needed. A group in that limited sense may be better than 1 stream = 1 gop.
- Does relay need to understand GOP dependency?
- [christian]: do groups imply sync across tracks, or is there a separate sync mechanism? Codec boundaries need to be defined somehow. Is this assumption true for all components in the architecture (relay, publisher, subscriber)?
- [vvv]: decouple groups and syncpoint, i.e., apply a timestamp to a sync point. somehow we need to associate the timestamp to an object as well.
- [luke]: SVC of the top layer is dependent on a base layer, so the group tries to solve this problem, but it feels primitive. sync point feels like a higher level construct than just a sync of pictures
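Illustration (not presented at the meeting): a minimal sketch of the two stream mappings described above, assuming a group starts at each keyframe. The `Frame` type and the function names are hypothetical and do not come from the WARP or RUSH drafts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    seq: int           # position of the frame in the track
    is_keyframe: bool  # True if the frame is independently decodable


def split_into_groups(frames: List[Frame]) -> List[List[Frame]]:
    """Start a new group at every keyframe, so each group begins at a sync point."""
    groups: List[List[Frame]] = []
    for frame in frames:
        if frame.is_keyframe or not groups:
            groups.append([])
        groups[-1].append(frame)
    return groups


def warp_style_streams(frames: List[Frame]) -> List[List[Frame]]:
    """WARP-like mapping (assumed): one QUIC stream carries one whole group (GOP)."""
    return split_into_groups(frames)


def rush_style_streams(frames: List[Frame]) -> List[List[Frame]]:
    """RUSH-like mapping (assumed): one QUIC stream carries exactly one frame."""
    return [[frame] for frame in frames]


# Six frames with a keyframe every third frame.
frames = [Frame(seq=i, is_keyframe=(i % 3 == 0)) for i in range(6)]
warp = warp_style_streams(frames)
rush = rush_style_streams(frames)
print(len(warp), len(rush))
```

In this sketch a subscriber joining late can start at the first frame of any WARP-style stream, while the RUSH-style mapping needs some other signal to tell it which single-frame streams are keyframes.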
-
[chairs poll] -- DOES THE OBJECT MODEL NEED AN ABSTRACT WAY TO REFER TO A SET OF BITS THAT IS INDEPENDENTLY DECODABLE?
- yes was 22 out of 25 that voted.
- [mo]: Independently delivered, i.e., give hints to the transport on what they want to get done. Nuances of open/closed/long/short may not be known to developers and then the main thing is how does the developer map their needs to a quic stream.
- [roberto]: the object model does not need to have the information. What moq needs to provide is where you need to look to find the representation. It is easier to find the information from a file or spread it across objects.
- [ted]: interoperability? this would be defined in the protocol, then all the information is in the protocol (versus) a pointer to the information.
- [roberto]: this can be in a different protocol spec instead
- [buck/charles]: application may have different needs, mobile devices care about temporal and not spatial given they do not have capabilities to render large videos, while big screens can... so the question is the relay would have to handle the complexity of mixing depending what are the capabilities of the downstream systems.
- [will]: Relays should never decide what to forward. The client should decide what to subscribe to.
- [christian]: are we asking a different poll for sync points?
- [ted]: run the poll later, maybe both are possible. they use the same or different marks -- it is still in front of the working group.
- [mo]: there are specific points that are defined in the container that allow you to seek, so you cannot arbitrarily seek.
- [spencer]: we need to understand what relays do (and we have agenda time for that, after the break).
-
[chairs poll] -- SHOULD WE USE THE SAME CONSTRUCT FOR SYNCHRONIZATION AND FOR DEPENDENCY GROUPING OF OBJECTS?
- no was 19 out of 19 that voted.
- [suhas]: should different things be synced together? application would need to handle this to put the timelines together.
- [rajeev]: I dropped out of the second poll, because I do not yet know what makes sense. The application versus a transport layer, I feel if we understand that there is a dependency between the application and transport, we should make that possible (so as not to have non-interoperability).
- [roberto]: unsure if I like the current relationship, would like to understand if the object model is strictly defining a relationship? would prefer that the relationships are looser.
- [luke]: I am unsure if we really need an object model, it is needed for the implementation. What we need are attributes or components that need to be implemented at the relay or at the subscriber or at the producer.
- [ted]: we need names so that we are using consistent terminology, otherwise, we may not get very far. "what questions need to be answered, use a number that defines that question, then name the number and not worry about the current names"
10:50 - 11:10: (Approximate) Break
- write stuff on a document to discuss requirements: https://jamboard.google.com/d/1ZQ2dr2zR4N5w40l0odTRVzX1468f_6Tf84oC91nz0_Q/viewer?pli=1
- Luke's terminology https://docs.google.com/document/d/15wCqWAZBN9eqZSdyvMtCQaMr6fsVlC6HocBZZk57YW8/edit
11:45 - 12:30: Relays, Caches, and other middleboxes: Who needs to know what when?
From the chairs: Among the open questions:
- Are there different behaviors for these that require different protocol behaviors? (e.g. a B2B UA might be a client and server from the point of view of the protocol, with an independently addressable set of resources).
- Are there ingest middleboxes as well as fan-out relays? Are there some boxes that do both? Do these need to be addressable by other protocol participants in different ways?
From Alan (privately):
A relay is any service that both subscribes and publishes media on the same broadcast?
Then there are subcategories:
* replication points (relays that fanout),
* caches (relays that can service subscriptions without refetching from an upstream publisher),
* transformers (relays that subscribe to one set of input tracks on a broadcast but publish a different set of output tracks),
* probably more?
Is it fair to say that a moq service that subscribes to one broadcast but produces another broadcast is not a relay, cache or replication point, but in fact, something else?
Spencer's starting point:
- We have MOQ endpoints
- We have "coordinating relays, caches, or replication points" in the MOQ charter.
-
We have "rate adaptation strategies based on changing codec rates, changing chosen media encoding/qualities, or other mechanisms" in the MOQ charter
-
Are we all starting from the same place?
- (Please be "yes" - this is from the charter!)
-
Ignoring the actual names, is Alan's taxonomy helpful? Spencer thinks it is, noting with this taxonomy everything is either a MOQ endpoint, a relay, or "something else".
- [suhas]: relays MUST NOT have access to raw media!
- We agree that a relay receives and transmits associated media without having access to the raw media.
- [ted]: we can define what we call these things later.
- [christian]: we cannot just be a push model. Are we assuming push everywhere?
- [vvv]: isn't this semantically similar to an HTTP proxy, where publishers POST while subscribers GET?
- [buck/charles]: Does a relay always have to terminate a QUIC session? (controversial?)
- [ted]: common ones that do fan-out MAY need to terminate QUIC because these may do congestion control. Others MAY NOT need to.
-
Do relays replicate?
- From a previous comment from Ted - usually, but not always
- [luke]: proposal relays in the doc are sending and receiving, nothing else.
- [suhas] "replicate" doesn't mean "fanout"
- [roberto]: relays should not have access to the media. We are missing a whole class of authentication/privacy model -- that would have implications on
- [varun]: there is a strong assumption that there is one producer in the broadcast and the subscriber is viewing the broadcast with one producer? what happens when we want the viewer to watch multiple producers? is that subscribing to several broadcasts or would it be possible for the relay to composite the producers within one broadcast
- [rajeev]: the relay can transfer the QUIC sessions in cases it may not be able to access it. An example of UPnP...
-
[rajeev]: two classes of intermediaries:
- do not have access to the raw media
- have access to the raw media
-
[rajeev]: it seems there are three features:
- do nothing with media, just transfer data
- change the set of media composition
- transform the media data
-
Do relays cache?
- [james]: Yes, but relays should indicate if they are, also specifying:
- what is cached,
- how long should it be cached?
- there should be a way to cache, but the data needs to be addressable.
- [lucas]: masque has a way to do something, we do not need to design them in moq, perhaps a way to accommodate designs from other WGs/components.
- [christian]: in the pull model, there is an origin in which the edge/relays will try to either fetch or publish the data. In addition, it will also authenticate if the viewer is able to get the stream and similarly, the relay would ask the origin if the publisher can publish the content.
- this requires a different type of box which is part of the architecture. ?? Authorization and discovery of data.
-
Do relays perform rate adaptation (isn't this going to be media-type specific, by definition?)
-
Is there any "something else" entity that needs part or all of the media-level metadata, but does not need all of the media, in order to function?
- [Rajeet?] - yes, a mobile operator may want to remove some tracks, etc. for media entering their networks without decoding the media itself
-
[chairs poll] VIRTUAL INTERIM BEFORE IETF 116?
- Yes, 22 out of 25 total.
12:30 - 1:45 (Approximate) Lunch, not provided
- [Alan to Draft owners] Publish a new version (revision) of the draft with some of the learnings we had, and perhaps ask for its adoption at the next IETF meeting as a starting point
- [Alan Idea]: Create small design team to fix problems that we talked about and try to figure out solutions for those
02:00 - 03:15 Open issues
-
[54] Add text to talk about relays and Pub/Sub
- [Spencer] Draft would keep things high level
- [Suhas] Relays need to be taken into account in MOQ
- OWNER: Suhas
-
[57] Slice as segment
- [Luke] For real-time conferencing you want to have smaller objects than video frames (slices)
- [Mo] Real issue is the protocol should NOT require the encoder to have lengths upfront (so avoid waiting to have all data to send it)
- [Victor][Luke] we need to fix the current spec
- [Will] We need to do progressive transfers (chunked transfer), we should not specify more than that. A QUIC stream per object allows us to drop the tail.
- [Kirill][Luke] An object with 0 length goes to the end of the QUIC stream, also we could have length
- OWNER: Suhas
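Illustration (not presented at the meeting): a minimal sketch of the "0 length goes to the end of the QUIC stream" idea. It assumes a hypothetical framing in which each object is a QUIC variable-length integer (RFC 9000, Section 16) followed by the payload; the draft's real message layout is not reproduced here, and `split_objects` is an invented name.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer starting at pos.

    Returns (value, position just after the integer).
    """
    first = buf[pos]
    length = 1 << (first >> 6)  # top two bits select 1, 2, 4 or 8 bytes
    value = first & 0x3F        # remaining six bits are the high-order value bits
    for byte in buf[pos + 1 : pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def split_objects(stream: bytes) -> List[bytes]:
    """Split one stream's bytes into object payloads (hypothetical framing).

    A length of 0 means the payload runs to the end of the stream, so the
    sender does not need to know the object size before it starts sending.
    """
    payloads: List[bytes] = []
    pos = 0
    while pos < len(stream):
        length, pos = read_varint(stream, pos)
        if length == 0:
            payloads.append(stream[pos:])  # open-ended object takes the rest
            break
        payloads.append(stream[pos : pos + length])
        pos += length
    return payloads


# One object with a known length, then one open-ended object.
stream = b"\x03abc" + b"\x00tail-of-stream"
objects = split_objects(stream)
print(objects)
```

With this framing only the last object on a stream can be open-ended, which matches the point that one QUIC stream per object lets the sender drop the tail.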
-
[58] Marking references (dependency graph)
- [Luke] Can we use just hints (IDR points) instead of using full graphs?
- [Christian] We should express some kind of ordering in a way that is easier to implement
- [Mo] Not sure MOQ is the right place to solve this, since there are other places (codecs) that are working on that
- [Suhas] We need to make it codec independent
- [Victor] We should keep complexity low, what are the benefits of adding more complexity?
- OWNER: Christian
-
[64] Catalog NOT enough for player selection
- [Will] It could be a problem to say that all selection criteria are inside CMAF data, because this is NOT owned by this group and it could slow down MOQ evolution
- [Victor] It seems it is an ownership problem
- [Suhas] Base document can refer to CMAF, but we can be open to negotiate new formats
- [Ted] We could use registry for this
- [Rajeev] We need to make it easy for everybody to migrate from current to MOQ, using current known formats (CMAF) can help
- [Kirill] The current info in CMAF is NOT enough for clients, we could add it, but is NOT there today
- [Mo] For dynamic cases (VC) it can get a little confusing how to properly encode that in a static approach like CATALOG. It would be great to see a proposal for more dynamic scenarios
- OWNER: Will
-
[66] Relax the CATALOG definition to decouple the base protocol from the streaming format
- [Will] To quickly evolve we should make CATALOG a contract between source & dest, keeping the relays out of that will help us move faster. We could create a separate doc to specify CATALOG. Have a URI, version, and opaque (at the base protocol level) payload. Relays only read the base protocol
- [Rajeev] This created 2 types of relays:
- The ones that understand the "opaque" payload
- The ones that do NOT understand the "opaque" payload
- [Alan] Can we avoid sending catalog? It seems yes if we figure out a way to send INIT segments out of band
- [Roberto] We should be explicit if we encrypt something, so say why
- [Kirill] We could keep the catalog out of band
- OWNER: Victor
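Illustration (not presented at the meeting): a minimal sketch of a catalog carried as a URI, a version, and a payload that is opaque at the base protocol level. The `CatalogEnvelope` type, the `urn:example:catalog:json` identifier, and the JSON payload are all assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class CatalogEnvelope:
    format_uri: str  # identifies the streaming format (hypothetical value below)
    version: int     # version of that streaming format
    payload: bytes   # opaque to the base protocol and to relays


def relay_view(envelope: CatalogEnvelope) -> Tuple[str, int, int]:
    """What a relay reads in this sketch: base fields and payload size only."""
    return (envelope.format_uri, envelope.version, len(envelope.payload))


# Endpoints register a parser per streaming format; relays never need this table.
PARSERS: Dict[str, Callable[[bytes], dict]] = {
    "urn:example:catalog:json": lambda raw: json.loads(raw.decode("utf-8")),
}


def endpoint_decode(envelope: CatalogEnvelope) -> dict:
    """What a subscriber does: pick a parser by format URI and decode the payload."""
    parser = PARSERS.get(envelope.format_uri)
    if parser is None:
        raise ValueError(f"unknown catalog format: {envelope.format_uri}")
    return parser(envelope.payload)


envelope = CatalogEnvelope(
    format_uri="urn:example:catalog:json",
    version=1,
    payload=json.dumps({"tracks": ["audio", "video"]}).encode("utf-8"),
)
print(relay_view(envelope))
print(endpoint_decode(envelope))
```

A relay that also wanted to understand the payload would need the same parser table as the endpoints, which is the second relay type noted in the discussion.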
-
[68] Data Model needs to expand to support different use-cases
- [Suhas] Think about the different use cases that the data model can cover
- [Luke] This is trying to cover too much ground, it should be broken into more tractable problems
- [Alan] Group is an element in the hierarchy that is missing in the current version
- OWNER: Luke, Suhas, Victor
-
[70] Globally unique broadcast URI
- [Luke] Why do you need global uniqueness?
- [Roberto] What are we trying to solve in this PR?
- [Will] It does NOT need to be a GUID, it should be as unique as a URL (ex: www.cnn.com/stream/111)
- [Ted] Videoconferences use case, publisher that does NOT control a domain, what to do there?
- [Jonathan] Is this actually a URL?
- [Ali] Elsewhere, RFC 7022 provides globally unique numbers
- [James] Using QUIC (TLS) constrains the possibilities of using DNS
- [Rajeev] In the whole peer2peer case we can run into challenges, the rest of the use cases seem OK
- [Suhas] Relay discovery seems out of scope
- [Mo] Let's just call it identifiers
- [Roberto] Authorization and security are also related to this "IDs" design
- [Lucas] If you are using WebTransport you are already using a URL
- OWNER: Ted, Suhas
03:30 - 04:15 Open issues continued
Interim before Yokohama
- TBD date and Timezone, if you have blockers due to other engagements, please report them. [jonathan] noted an AVT interim in Feb as one blocker.
-
[78] Definition of Messages in Section makes too many assumptions
- [Christian] Pay attention to termination condition, very useful for testing to make sure to know what you lost
- [Luke] We need to define how streams work in the data model
- [Jonathan] Knowing what you didn't get tells you a lot about what you got and what you have to do about it
- [Suhas] We need to expose something in the API about what is missing
- [Alan] Which messages in the draft can be partially reliable?
- [Luke] Objects are the only ones
- [Christian] Separate the data model and the message format. The transports might use different formats, but the data items have to be expressed
- [Ted] Do we have a strategy if QUIC is not available?
- [Rajeev] That QUIC failover could also happen in the middle of the connection
- [Martin] If it's over WebTransport, we fall back to HTTP/2 naturally
- [Suhas] Separate how things are related (ex: GOP) and how things are mapped to transport
- [Luke] Totally allowed to drop the end of a frame. Please see if we need multiple modes before we add them. Let's do one first.
- [Jonathan] When you are dropping, it is interesting to know the dependencies (what is useful or not), that adds complexity (tradeoff)
- [Christian] Add metadata to a frame that indicates dependencies, when we drop it we know the dependencies. Objects are atomic if they're encrypted.
- [Alan] Need to express ideas about what is useful to consume partially.
- [Christian] Great to end with one way, but start with many
- [Luke] The point of current draft is some compromise, better NOT to fork the protocol and find a middle ground
- [Suhas] Have a data model, leave the transport details to the application
- [Roberto] We need to think about the transactional overhead of having lots of small objects
- [Alan] An important thing is to understand the smallest unit to drop or to understand (that has meaning)
- [Mo] Several reasons for the "atom". What is useful, plus encryption auth tags.
- [Suhas] Object should be that (smallest) atomic unit. And it can be composed with N packets
- [Luke] GoPs can be thousands of packets. Encoding determines what's droppable.
- [Jonathan] Argue that GOPs are NOT atoms because the tail can be dropped. There are some cases where frames are NOT even atoms (Slices)
- [Luke] would you buffer a whole frame to make sure that it's arrived? [no] so you'd run the risk
- [Jonathan] If I make a decision to drop, I drop the whole atom
- [Roberto] Careful with the semantics, come up with the right terminology
- [Suhas] Separation of dependency and transport mappings is useful. For instance GOP = QUIC stream
- OWNER: Christian, Suhas (at least 1st triage / breaking down)
-
[82] Broadcast URI in Object message
- [Kirill] Concerned about the length of URL and object overhead
- [Christian] There is consensus in the need to reduce overhead
- [Will] Relays would benefit of having the whole URI, we need it to create the cache key to uniquely identify the object
- [Alan] Are we ok on putting URI in the objects, and solve optimization problem after adoption
- [Ted] I think this seems to be optimization, not architectural, agreement to move it to a performance issue
- [Jonathan] The object can perhaps have only trackID
- [will] what if two publishers use two track IDs?
- [Lucas] Lots of existing work here, we can solve this. What is the performance objective?
- [Roberto] 2 issues we are concerned in this PR
- Avoid overhead
- Put this in a consistent and simple way for relays
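Illustration (not presented at the meeting): a minimal sketch of a relay cache key that includes the full broadcast URI, as discussed above. The field names (`track_id`, `group_id`, `object_id`), the example URIs, and the choice of SHA-256 are assumptions for the example, not part of the draft.

```python
import hashlib
import json


def cache_key(broadcast_uri: str, track_id: int, group_id: int, object_id: int) -> str:
    """Derive a fixed-size cache key from the full object identity.

    The fields are serialized as a JSON array so that field boundaries are
    unambiguous, then hashed so the key length does not grow with the URI.
    """
    material = json.dumps([broadcast_uri, track_id, group_id, object_id])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# The same track, group and object IDs under two different broadcasts.
key_a = cache_key("https://example.com/stream/111", track_id=1, group_id=7, object_id=0)
key_b = cache_key("https://example.org/stream/111", track_id=1, group_id=7, object_id=0)
print(key_a != key_b)
```

Without the broadcast URI in the key, two publishers that happened to pick the same track ID would collide in the cache; the hash only bounds the key size inside the relay and does not reduce the on-the-wire overhead raised in this issue.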
-
[Alan] Session restart and reliability were on the agenda, will not happen today.
-
[Ted] Issues for design teams (hopefully having this solved before the next interim (~March))
- Taxonomy and terminology
- This also includes (or is very related to) the object model. Some agreement (Will, Ted) that the object model is a very big priority
- Team members: James, Spencer, Luke, Ali
- Intermediaries
- Definitions of network elements (what they do, don't do ...)
- Team members: Will, Suhas, RajeevRK, Hang, Varun, Spencer
-
[Will] Will open issue and try to clarify object model
- [Ted] We need to abstract away from existing systems and try to propose some primitives (building blocks) that allow us to solve MOQ problems
# Wrap up!
[Alan] Good progress, probably NOT the one everybody desired!
THANKS TO EVERYONE!!!!