Audio/Video Transport Core Maintenance (avtcore) Working Group
===============================================================

CHAIRS: Jonathan Lennox, Bernard Aboba

IETF 111 AGENDA

Monday, July 26, 2021

16:00 - 18:00 Pacific Time

Session III, Room 1

IETF 111 Info: https://www.ietf.org/how/meetings/111/

Meeting URL: https://ws.conf.meetecho.com/conference/?group=avtcore

Etherpad: https://codimd.ietf.org/notes-ietf-111-avtcore

Slides: https://docs.google.com/presentation/d/19oE_q0i3--Yf13ESrAcR_CyUeJ7AVaw0sEawLb7_r7U/

-------------------------------------------------

## Preliminaries (Chairs, 10 min)

Note Well, Note Takers, Agenda Bashing, Draft status

Note Taker is Shuai Zhao

## Cryptex (Sergio Garcia Murillo, 10 min)

https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-cryptex

Sergio: Latest version of the specification includes test vectors.

Sergio: Two implementations: GitHub PR 29 on jitsi-srtp and PR 511 on libsrtp. Fixed one editorial issue.

Sergio: Next step: WGLC once -03 is submitted.

Bernard: Do the implementations interoperate?

Sergio: Not yet.

Jonathan: They successfully test against the same test vectors but haven't been wired up to the rest of the signaling and media stack.

## RTP Payload for VVC (Shuai Zhao, 15 min)

https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-vvc

Shuai: A few bug fixes. The -10 draft (latest) is ready for WGLC.

Shuai: Base64 instead of Base16 for encoding binary parameters.

Shuai: Removed sprop-depack-buffer-nalus.

Shuai: New text (10 pages) around SDP parameters. Need new reviews on this (S 7.2.2, 7.2.3, 7.2.4).

Shuai: Request for WGLC.

Justin (as notetaker): Which sections need review?

Shuai: Click on the links in the presentation; they will lead you to the added sections.

## RTP over QUIC (J. Ott & M. Engelbart, 20 min)

https://datatracker.ietf.org/doc/html/draft-engelbart-rtp-over-quic

Joerg:

- Encapsulation for carrying RTP/RTCP over QUIC
- How to correctly interact with QUIC congestion control
- Related work:
  - QRT: QUIC RTP Tunneling (hurst)
  - RUSH (kpugin-rush-00.html)
- SDP signalling
- RTP/RTCP
  - Use QUIC information where possible (instead of RTCP)
- Local interface req-
  - QUIC implementation: must provide information on ACKs, RTTs
  - Congestion controller:

Bernard: Have you looked at the WebTransport API?

Joerg: Not yet. What is the timeline for the WebTransport API?

Bernard: WebTransport over HTTP/3 has an experimental implementation in Chromium. However, there is no support for stats or ACK info currently, so I don't think it satisfies your requirements.

Justin: Can you really do your own congestion control on top of QUIC congestion control? Might be better to have only one congestion control algorithm operating (e.g. turn off QUIC cc).

Joerg: There may be ways of doing that; there is one slide for discussion.

- (RTP + RFC8888) vs (QUIC datagram)
- Experiments:
  - using gstreamer over quic-go
  - experiments on different cc algorithms
- signalling:
  - RFC8122, 8843, 5761
  - an SDP example
- cc interaction:
  - how media senders (reliable QUIC stream, QUIC datagram, RTP) interact with QUIC CC
  - does not have an answer to this question yet.
- question: is this useful? maybe a larger topic in media-over-quic

Justin: There has been interest in this for years. It is harder than it looks. We have been waiting for QUIC v1 to be done. Now it may be a good time to look at this.

Jonathan: What is gained by doing RTP over QUIC?

Joerg: Useful in terms of multiplexing RTP streams. An experimental study is needed to find out how useful that would be.

Sergio: I see the potential; however, I don't see how multiple RTP sessions are multiplexed in one QUIC transport.

Mo: Curious, what are the obvious broken things when combining RTP and QUIC together?

Joerg: Mostly the API.

Mo: What would make it a good/bad transport? The application's own media stack logic vs QUIC.

Joerg: Hard problem to solve.

Bernard: QUIC not only supports datagram transport (unreliable/unordered) but also partial reliability (reliable/unordered), by sending each frame on a separate QUIC stream with a timer (if the frame isn't delivered in the required timeframe, a RESET is sent). RUSH uses this. In this approach, it is possible to differentiate based on frame types (e.g. higher reliability/latency for a keyframe than for a discardable frame). In this approach there is only QUIC congestion control.

Joerg: We have looked at that option and dropped it in the current text.

Justin: I do not favor pursuing this work. Using QUIC itself may be easier to deploy. (Maybe Justin can confirm if this is indeed his position)

## SPacket (Sergio Garcia Murillo & Youenn Fablet, 50 min)

https://datatracker.ietf.org/doc/html/draft-murillo-avtcore-multi-codec-payload-format

Sergio:

- Continue the current work and add support for new codecs in the future.
- Applying SFrame on a per packet base (SPacket):
  - SPacket & SFrame have quite a bit in common (80 percent).
  - SFrame vs SPacket overhead: SPacket has higher overhead (e.g. authentication tag with each packet)
- SDP negotiation (negotiate codecs as well as an opaque packet type)
- Payload type multiplexing: using a single payload type for all encrypted codecs (with an APT header extension to differentiate them)
- Frame Metadata: potentially relevant metadata and solutions.
- SFrame delta from SPacket.

Sergio: What still needs to be addressed?

Mo: So you prefer SFrame?

Sergio: We have both, 80% work done for SFrame and all work done for SPacket. We need to pick one.

Mo: I think the concern about SFrame isn't just about using a generic packetizer. It is about potentially removing the codec specific elements in existing RTP packet formats. For example, removing the VP8/VP9 payload headers. These headers contain information that may be relevant for recovery that is not necessarily present in the RTP header extension (e.g. Dependency Descriptor or Framemarking). So you're betting on the ability of the proposed RTP header extension to provide metadata adequate not only for a subset of existing codecs but also for future codecs.

Sergio: If we can agree on the elements common to SFrame and SPacket, we have 80% work done for SFrame.

Bernard: A question on the "SDP negotiation" slide: opaque could be any codec? How would the endpoint know which codec is coming?

Sergio: You need to first negotiate the clear text codec as well as the use of opaque. A sender can only send E2E encrypted if opaque is negotiated, and then only the codecs that are negotiated. The APT in the header extension lets the receiver know which codec is being received.

Bernard: Question about the incompatibility with WebRTC Insertable Streams. Are you saying that SPacket is incompatible with current encrypted transform implementations, or that it cannot work with the API even with implementation changes?

Sergio: With WebRTC encrypted transform, the application has no control over the packetization, which SPacket requires. So you'd have to add packetization control to the API.

Bernard: Question about the slide "Applying SFrame on a per packet base (SPacket)". The slide shows higher overhead for SPacket. I am wondering if it is required to provide the authentication tag with each packet, or whether we can retain the SFrame approach by providing it with each frame.

Sergio: In SPacket, we are doing encryption with each packet (e.g. that is why we need to add the IV), so we also have an authentication tag with each packet.

Harald: You are defining two ways of solving the problem (SFrame and SPacket) and you are planning to discard one of them? Is that correct?

Sergio: We have all work done for SPacket and 80% done for SFrame. So we need to agree on one of them.

Harald: Will anyone implement SPacket?

Sergio: We will have one specification at the end.

Jonathan: Question about the slide "SDP negotiation". The proposed SDP allows an Offerer who supports E2E encryption to interoperate with an Answerer who doesn't support E2E (by falling back to cleartext payloads). But is that realistic? If you want to use E2E encryption, it seems more likely that you require it and won't accept falling back to cleartext payloads.

Jonathan: The metadata is different between SFrame and SPacket. If you use SFrame, frame metadata is useful for decoders. With SPacket, SPacket decoders can use the metadata in the codec itself. Today SFrame only works with 4 codecs; with SPacket you can work with any codec.

Sergio: Happy with both approaches.

Youenn: Encryption for all streams? Probably not, it's application-specific.

Mo: I don't think we can assume that new codecs will support SPacket. So it may not in practice offer better codec support than SFrame.

Mo: RTP payload formats are inherently different from container formats. When new codecs are created they traditionally develop both an RTP payload specification as well as a container specification.

Bernard: Container formats are increasingly used in low latency use cases. The WebCodecs API supports container formats, as does the MSE API, and we are also seeing container formats being transported over the WebRTC data channel. Container formats also support media encryption (e.g. for content protection).

Mo: The codec will generate an elementary bitstream format, and a separate look in a container format.

## Wrapup and Next Steps (Chairs, 10 min)

Do WGLC for VVC-10 and Cryptex-03 drafts

Chairs do write up and request publication on Framemarking

On SFrame and SPacket ...

Chair proposed way forward: Need to get clearer on which path to take (SFrame or SPacket).

Justin suggests: explain SPacket in the draft intro as a conceptual aid, then explain how to get from SPacket to SFrame. Lead the reader to the correct understanding of SFrame.

Chair also would like to see the opaque mechanism fully described for one of the underlying payload formats.

Chair: EVC status?

Stephan: We still need a couple of revisions.