Minutes of the AVTCORE WG Virtual Interim Meeting, January 28, 2021
-------------------------------------------------------------------

### Draft Status

Bernard to do the IESG writeup of the RTT multiplexing draft.

Jonathan to follow up on the status of the Tetra codec payload draft.

Cryptex needs to be submitted as a WG draft.

### JPEG XS Payload Format

Next steps:
1) New WGLC
2) Reviews by Jonathan and Bernard (perhaps others?)

### Framemarking Last Call

(from slide) Stephan Wenger: Do we still think this is a good idea, i.e., will it work in practice? Do we need to require future codec specifications to include support for it?

(from slide) Sergio Garcia: We have identified some issues with VP8/9.

Mo Zanaty: The P/U bit issue triggered a recent update to the draft (re: temporal nesting).

Bernard: Aren't all the temporal scalability modes nested (e.g., L1T2, L1T3, etc.)?

Jonathan: All the useful ones.

Bernard: So the P/U bit issue is only relevant to non-nested spatial scalability modes (e.g., VP9)?

Jonathan: Vidyo used framemarking to introduce the Temporal ID field that was not present in H.264/AVC, so that temporal scalability could be supported in WebRTC.org (to which a patch was contributed).

Bernard: This allowed H.264/AVC implementations in WebRTC to support temporal scalability without having to implement H.264/SVC (a separate codec with additional NAL units).

Sergio: Had to do some remapping for VP9, and K-SVC was not supported.

Bernard: VP8 implementations require the PictureIDs to be consecutive (this is not the case for VP9). So if an SFU drops frames, it has to rewrite the PictureID and TL0PICIDX fields, or else the decoder might not be able to handle it. This means that an SFU supporting VP8 MUST be able to parse (and modify) at least portions of the VP8 payload.

Jonathan: This is an issue with VP8 decoders that could in principle be fixed.
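The rewriting Bernard describes can be sketched as follows. This is not from the minutes: it is a rough illustration of an SFU adjusting the PictureID and TL0PICIDX fields in the VP8 payload descriptor (field layout per RFC 7741) so that a receiver sees consecutive values after frames are dropped; the function name and offset bookkeeping are invented for illustration.

```python
# Illustrative sketch: SFU rewriting of the RFC 7741 VP8 payload descriptor.
# pid_offset / tl0_offset are the counts of pictures / TL0 frames this SFU
# has dropped so far (how they are tracked is out of scope here).

def rewrite_vp8_descriptor(payload: bytes, pid_offset: int, tl0_offset: int) -> bytes:
    buf = bytearray(payload)
    b0 = buf[0]
    pos = 1
    if b0 & 0x80:                      # X bit: extension byte present
        ext = buf[pos]; pos += 1
        if ext & 0x80:                 # I bit: PictureID present
            if buf[pos] & 0x80:        # M bit: 15-bit PictureID
                pid = ((buf[pos] & 0x7F) << 8) | buf[pos + 1]
                pid = (pid - pid_offset) & 0x7FFF
                buf[pos] = 0x80 | (pid >> 8)
                buf[pos + 1] = pid & 0xFF
                pos += 2
            else:                      # 7-bit PictureID
                buf[pos] = (buf[pos] - pid_offset) & 0x7F
                pos += 1
        if ext & 0x40:                 # L bit: TL0PICIDX present
            buf[pos] = (buf[pos] - tl0_offset) & 0xFF
            pos += 1
    return bytes(buf)
```

Note that this parsing/modification must happen inside the VP8 payload itself, which is exactly what prevents end-to-end encryption of the full payload, as discussed below.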
Justin: From a Chrome perspective, although I forget the details, I think SFRAME's marking (the Generic Frame Descriptor) was intended to replace framemarking.

Bernard: To be clear, the VP8 PictureID/TL0PICIDX issue will afflict *any* RTP header extension used for forwarding. So a GFD or Dependency Descriptor implementation will still need to parse and modify the VP8 payload. So if the goal is to enable "codec-agnostic forwarding without parsing the codec payload", it would seem unachievable (at least without patching VP8 implementations).

Mo: I don't think there's anything codec-specific in framemarking that causes a problem. It's really a question of whether you have a complex/dynamic scalability structure or not. Framemarking was only designed for simple/static ones, and it can do that for any codec. To Stephan's point, if you are doing something tricky, you really need to expose the codec bytes directly rather than use this sort of generic descriptor. But it would still be nice to have something rather than nothing.

Sergio: There's still a lot of codec-specific work to be done with framemarking, so that's a problem for implementors (referring to earlier comments about VP8).

Bernard: Jonathan has had some ideas about how to improve VP8 to deal with this issue, but ultimately there are problems when you want to do e2e encryption; because of the VP8 issue you cannot encrypt and integrity protect the entire payload.

Mo: The goal here was to allow receivers to get metadata without having to dig into the payload. But the sender always needs to understand the metadata so that it can construct the framemarking header.

Jonathan: Do we want to abandon framemarking, or just the part that says all payloads need to have a section indicating how they use this mapping?

Bernard: Maybe we can get people to weigh in on this. The document is still useful and valuable if only to document what was done, and what we learned from it. Opinions on what to do next?
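For context on the header Mo refers to, the sender-side construction can be sketched as below. This is not from the minutes, and the layout is a sketch of the long form of the framemarking RTP header extension as described in draft-ietf-avtext-framemarking (S/E = start/end of frame, I = independent, D = discardable, B = base-layer sync, 3-bit TID, then LID and TL0PICIDX bytes); the function name is invented for illustration.

```python
# Illustrative sketch: packing the three-byte long-form framemarking
# extension payload from codec metadata the sender already knows.

def pack_framemarking(s: int, e: int, i: int, d: int, b: int,
                      tid: int, lid: int, tl0picidx: int) -> bytes:
    # First byte: S|E|I|D|B flags followed by the 3-bit temporal layer ID.
    first = (s << 7) | (e << 6) | (i << 5) | (d << 4) | (b << 3) | (tid & 0x07)
    return bytes([first, lid & 0xFF, tl0picidx & 0xFF])
```

The point of Mo's comment is visible here: every field comes from codec metadata, so the sender must understand the codec's scalability structure even though the receiver/SFU can then stay codec-agnostic.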
Mo: I would tend to side with use rather than anything else. If people want to use this, or something like this, let's proceed. If nobody wants this, then no. Still, there's a lot of activity in AV1 DD and SFRAME indicating there is interest here.

Bernard: These follow-on efforts do seem to be encountering some of the same issues identified here. Not sure that those new efforts can solve what we couldn't solve here.

Mo: New things may still come up in VVC or elsewhere.

Bernard: As Stephan noted, SFU developers don't want to have to parse every codec. Being able to easily support a new codec has agility value, even if every feature of the new codec may not be supportable (or there is some loss of optimality). Is there value in getting a simple solution that covers most cases?

Sergio: Use cases drive implementation. VP9 K-SVC is used in webrtc.org screen sharing. If framemarking cannot support VP9 or AV1 K-SVC or VP8, support for temporal scalability in H.264/AVC is not enough to justify keeping the code around, because you will need an alternative that can achieve wider coverage.

Magnus Westerlund: Each bit of information needs to be evaluated for its privacy impact. We really need SFRAME users to think about how they package their metadata. Generalizing as much as possible still seems useful. Something to think about here.

Bernard: Do you have a specific opinion here?

Magnus: If nobody is implementing this, maybe we need to skip ahead to v2. We could publish this as Informational.

Justin: Definitely value in a generic mechanism if people feel this would cover 90% of cases. There will always be custom cases where you need the SFU to parse raw metadata; we need to figure this out in the SFRAME context. The real question is whether this mechanism covers 90% or 10%. Would be useful to get feedback from Google implementors.

Sergio: Don't think framemarking is useful at this stage. Not sure exactly whether releasing as Informational or something else is best.
Tim Panton: Defining the bits can be more difficult than actually generating them. Doesn't answer the question of what to do with the document.

Mo: To the 10% or 90% question, the authors naturally feel that this covers 90%+ of "common meeting services". Shouldn't restrict implementations from going beyond. We could say how one could go beyond if desired. Any proposal will have to have many of the same elements: start of frame, end of frame, etc.

Sergio: I disagree with the authors on the coverage here. For example, K-SVC not being supported means it isn't really covering 80% of cases, because you can't do screen sharing, which is a common use case.

Mo: The WG didn't consider VP9 because the WG didn't want to try to cover draft payload specs.

Jonathan: The issue is that framemarking doesn't give the SFU enough information to know what to forward for spatial or K-SVC modes. If you need to have framemarking and something else, then framemarking isn't terribly useful.

Bernard: Next steps? Hums?

Roni Even: We should have a hum about whether future codecs will need to support this.

Jonathan: Let's do the first hum first.

Result: 6 in favor of publishing, 2 in favor of not publishing.

Jonathan: Weak consensus to publish. Second hum: should we publish as Proposed Standard?

Magnus: Hard to answer this question right now. Do we have something better?

Jonathan: Hard to know right now, i.e., AV1 DD is still being implemented.

Bernard: So we don't know yet whether it can achieve wider codec coverage, or will face similar limitations to framemarking (likely with respect to VP8).

Magnus: Perhaps publish as Experimental?

Jonathan: I had been thinking Informational, but Experimental could also work.

Bernard: Let's hum for PS.

Result: 1 in favor of PS, 7 for something other than PS.

Jonathan: Should we get into the details on Experimental vs. Informational?

Mo: Are we discussing the current doc, or an updated version of the doc, or?

Jonathan: Does anyone want to speak to this?
Jonathan: We could rewrite this to be the AV1 DD, in an extreme case.

Bernard: There's value in documenting what we have. Doesn't make sense to try to fix every issue; that would turn this into something else. Document what it does and doesn't do and leave it at that.

Magnus: I think that's the right answer. If there are easy fixes, do that. Otherwise send it out as Experimental. Don't hold it up too long. We can replace it later if we come up with something.

Bernard: We'll bring this to the list, but it seems like we have good guidance.

### VP9 Payload

Jonathan: The only open issue here is regarding framemarking. Shall we just move this section to the framemarking draft and publish?

4 +1s in the chat for this.

Jonathan: That's what we'll do.

### SFrame

Youenn Fablet: Sergio and I are looking into how to support SFrame. It breaks assumptions for the packetizer; not always tractable to handle this. Is there another approach we can use here? It may have some other benefits. Three main parts: processing model, generic packetization, negotiation.

Youenn: The processing layer can split a logical frame into multiple sub-frames with metadata, each of which will be packetized independently.

Youenn: Once we have subframes, we need to send them. Opaque data; the packetizer just splits it if needed to fit in the MTU. Metadata is sent as an RTP header. Very similar to framemarking.

Jonathan: This makes life easier for the packetizer, but does it make life hard for the depacketizer?

Youenn: Depacketizer processing is very simple. Aggregate split frames and pass them to the decoder.

Sergio: Some complexity on what a frame is. Let's not discuss what comes into the depacketizer but rather how it will work.

Magnus: Quite reasonable to discuss this. This payload format could be misused. This goes from a basic payload format to something more complicated.

Colin Perkins: Not clear how much of RTP is left once you do this. All the effort in payload formats, does that go away? Seems like a startlingly large change.
Sergio: There was agreement we were going to work on this. We're only looking at codec-specific packetization here. NACK, FEC: still the same.

Colin: This is a fundamental feature of RTP.

Sergio: This is what SFrame is trying to do.

Colin: This isn't using 90% of RTP features. Why not make a new thing?

Youenn: This reuses lots of RTP: headers, etc. We're giving up some of the benefits of codec-specific packetization, but we think that's not that much. Would be good to hear the specific issues. Filing them in GitHub would be nice.

Colin: We don't need to go to GitHub to do this.

Jonathan: Just get them in writing.

Youenn: Let's get precise issues.

Colin: You're not using RTP features here.

Youenn: I don't understand that.

Justin: I think we are using a lot of RTP functionality: headers, encryption, recovery, SSRCs, etc. So let's get the issues identified, since making a new thing would have way more unknowns.

Magnus: I think you're talking past each other somewhat. It will become clearer how you're reimplementing existing mechanisms.

Sergio: Will be doing an update of the document that will make this all clear (prior to IETF 110).

Mo: This is similar to RED or FEC: those are wrappers, and this is a privacy wrapper. If we can define how the RTP bits work with this transform, this may be what we need.

Youenn: We have some more slides that speak to this.

Colin: What Mo is suggesting may match what is proposed, but I'm not getting that. Need a clearer architectural description; we're talking past each other right now.

Bernard: Time check.

Jonathan: We need more overview; let's ask the authors to provide more of this.

Sergio: But we did this in the last IETF meeting.

Youenn: Not sure exactly what is needed and next steps.

Bernard: Next step is to have a draft.

Colin: And it needs to describe how it fits with the rest of RTP. The authors think they have explained it, but there seem to be a lot of assumptions that are not written down anywhere.
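For reference, the generic packetization Youenn describes earlier (an SFrame-protected subframe that is opaque to the packetizer, split only to fit the MTU, with start/end metadata carried outside the opaque data, in the proposal as an RTP header extension) might look roughly like the sketch below. This is not from the draft; the function and field names are invented for illustration.

```python
# Illustrative sketch: codec-agnostic packetization of one opaque subframe.
# The packetizer never inspects the (encrypted) bytes; it only chunks them
# and tags each chunk with start/end-of-subframe flags.

def packetize_subframe(subframe: bytes, mtu_payload: int):
    chunks = [subframe[i:i + mtu_payload]
              for i in range(0, len(subframe), mtu_payload)] or [b""]
    return [
        {"start": idx == 0,                # first chunk of the subframe
         "end": idx == len(chunks) - 1,    # last chunk of the subframe
         "data": chunk}
        for idx, chunk in enumerate(chunks)
    ]
```

The depacketizer's job is correspondingly simple, as Youenn notes: concatenate chunks from start to end and hand the reassembled subframe to the decrypt/decode path. The debate in the minutes is about what is lost, namely the codec-aware packetization rules (e.g., NAL-unit-aware splitting) that existing RTP payload formats provide.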
Jonathan: Let's get an email thread going with Colin and Magnus. We need to clarify what clarifications are needed.

Sergio: We need to understand the doubts.

Colin: How does this fit into the RTP architecture?

DrAlex: I have an idea; let's take this offline.

### QUIC RTP Tunneling

Spencer: The filename indicates this draft is targeted to the QUIC WG, but if this work is in scope for AVTCORE, wouldn't it get more attention here if it were -avtcore-?

Jonathan: Probably, but it may need some charter tweaking.

Spencer: Seems like an excellent time to have that conversation.

Magnus: The QUIC WG is mostly for extensions. Protocol mappings should happen elsewhere, e.g., here.

Sam: No QUIC extensions needed here. Will repost.

Justin: We've looked at this at length over the past 5 years, and one of the fundamental problems is the mismatch between QUIC congestion control and RTP congestion control (which are tuned differently for different purposes).

Sam: Want to experiment with this.

Sam: Interested in any feedback on this document.

Jana (in chat): Justin's point is important. Please bring this to the QUIC WG, or at least bring it to the attention of the QUIC WG, and engage there as well. There are many interactions here.

Bernard: Please also note that we have both QUIC datagrams and HTTP/3 datagrams. The WebTransport WG is going to be focusing on Http3Transport, so there will probably be an additional multiplexing field in the datagram. WebTransport has also discussed the prioritization issue.