# Audio/Video Transport Core Maintenance (avtcore) Working Group {#audiovideo-transport-core-maintenance-avtcore-working-group}

CHAIRS: Jonathan Lennox, Bernard Aboba

IETF 115 Agenda

Date: Tuesday, November 8, 2022

Time: 09:30 - 11:30 London Time

Session I, Mezzanine 10-11

Meeting link: https://wws.conf.meetecho.com/conference/?group=avtcore

Notes: https://notes.ietf.org/notes-ietf-115-avtcore

Slides: https://docs.google.com/presentation/d/1BIcHr7XF03vn81DTRRv4-DQv5-niaVCGCi2ZPNqSwwc/

* * *

1. Preliminaries (Chairs, 20 min)

Note Well, Note Takers, Agenda Bashing, Draft status, CfAs

Notetaker: Magnus Westerlund

CfA on "Game State over RTP" (draft-jennings-dispatch-game-state-over-rtp)

Jonathan Lennox: The "Call for Adoption" of "Game State over RTP" completed on May 8, 2022. There were only two responses to the CfA, one from Suhas (in favor) and one from Stephan Wenger, who was in favor of adopting the RTP payload format part of the spec, but not the game states (the majority of the document). At IETF 114, the Chairs proposed potentially extending the CfA, if the authors have a proposal to develop a community to move the work forward in IETF. Do the authors want to continue?

Cullen Jennings: Defining the RTP format in AVTCORE is possible under the Charter, but apparently there is not enough interest. I agree that only two respondents is not enough.

David Schinazi: Why is an IETF RFC needed?

Cullen: We need an IANA registry to enable registering future additional game states.

Stephan Wenger: I don't think that defining game state formats is the IETF's business. This WG doesn't have the expertise in games to review those formats. This is more like a codec format and that should be done in a body that has interest in this type of work, like AoMedia or MPEG.

Cullen: The IETF has done Opus, VP8 and VP9, which bodies such as MPEG did not like.

Stephan: I haven't seen any effort in MPEG or AoMedia to get them to accept the game state format.
Bernard Aboba: The W3C has defined a game pad API and has had game-related workshops. So perhaps there is expertise there?

Harald Alvestrand: AVTCORE is the gatekeeper for registering the RTP payload format. So it should lead, follow or get out of the way.

Stephan Wenger: There is a difference between the RTP payload format and the game state format. The AVTCORE WG does not define codecs; this is done by others. I did not object to defining the RTP payload format in AVTCORE; the WG has the competence to review that.

Magnus: The RTP payload format is AVTCORE's chartered business, but going through a WG is not the only way of publishing and registering an RTP payload format. That could be done via an Independent Submission.

David Schinazi: The IANA registries are open. An alternative to handling the work in the WG would be to publish the document in the independent stream.

Bernard Aboba: What would the registry policy be?

Cullen: A small number of code points would be Expert Review (with specification required) and larger spaces would be first come, first served.

Jonathan: Does anyone object to taking this document to the Independent Stream? No objection.

Action: The AVTCORE WG recommends that the document be taken to the ISE.

CfA of RTP Payload format for V3C.

Jonathan: The CfA completed on October 31, 2022. There were five responses, all positive. There appears to be WG consensus for adopting the document. Any objections? No objections.

Action: The AVTCORE WG adopts "RTP Payload Format for V3C". The authors should re-submit the document as draft-ietf-avtcore-rtp-v3c.

CfA on “RTP Control Protocol (RTCP) Messages for Green Metadata”

Jonathan: This CfA is ongoing, due to complete on November 30, 2022. Please respond to the CfA on the list.

2. RTP Payload Format for the SCIP Codec (D. Hanson, 10 min)

https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-scip

Dan Hanson: We submitted -03 in October 2022, and have contacted the reviewers in SECDIR, GENART and ARTART to verify that their review comments have been satisfactorily addressed. One reviewer asked whether the reference to SCIP210 should be normative.

Cullen: The reference should remain informative, and that should be the general guidance for RTP payload format authors. That is important. We need to distinguish IPR declarations relating to the RTP Payload specification from declarations relating to the codec.

Stephan Wenger: The IETF seems to be getting more and more bureaucratic.

Bernard: Yes, that is a concern. The authors have messaged the reviewers, informing them of the draft updates. Did they respond?

Dan Hanson: No.

Bernard: You have done your part. We should have enough information to move forward on the Publication Request.

3. RTP over QUIC Sandbox (B. Aboba, 10 min)

https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic

At the virtual interim, we presented an experimental implementation of an "RTP-ish" packet format using the WebCodecs and WebTransport APIs, using only JavaScript, no WASM. Latency was good at small resolutions (QVGA/VGA), but at high resolutions (HD, full-HD) there was a visible lag. For example, for AV1 encoding of full-HD at 30 fps, with a 1 Mbps average target bitrate, glass-glass latency was measured at 630 ms when the frame RTT was 100 ms.

Other curious observations: re-ordering was not observed on the receiver, and the observed bandwidth consumption was considerably lower than the average target bitrate.

We investigated a number of potential causes and believe we have found the culprit. In the WebTransport API, the writer.write(chunk) promise completes when the chunk is handed off to the QUIC send queue.
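The sending patterns at issue in this item can be sketched roughly as follows. This is a minimal TypeScript sketch, not the code from the sandbox: the interface names (`FrameWriter`, `FrameStream`, `StreamOpener`), the function names, and the `onError` callback are hypothetical, and the interfaces only mirror the parts of the WebTransport and WHATWG Streams APIs that the sketch touches (`createUnidirectionalStream()`, `getWriter()`, `writer.ready`, `writer.write()`, `writer.close()`). The first function awaits the write promise before returning; the second waits only for `writer.ready` and queues the write and close without awaiting them.

```typescript
// Minimal structural types so the sketch compiles outside a browser.
// Assumption: a browser WebTransport object and its stream writers
// should satisfy these shapes.
interface FrameWriter {
  readonly ready: Promise<unknown>;
  write(chunk: Uint8Array): Promise<void>;
  close(): Promise<void>;
}
interface FrameStream {
  getWriter(): FrameWriter;
}
interface StreamOpener {
  createUnidirectionalStream(): Promise<FrameStream>;
}

// Pattern 1: one stream per frame, awaiting write().
// A caller that awaits this per frame cannot start the next frame
// until the current write promise has settled.
async function sendFrameAwaitingWrite(
  transport: StreamOpener,
  frame: Uint8Array,
): Promise<void> {
  const stream = await transport.createUnidirectionalStream();
  const writer = stream.getWriter();
  await writer.write(frame);
  await writer.close();
}

// Pattern 2: one stream per frame, awaiting only writer.ready.
// write() and close() are queued without being awaited; failures go to
// the hypothetical onError callback instead of being silently dropped.
async function sendFrameWithoutAwaitingWrite(
  transport: StreamOpener,
  frame: Uint8Array,
  onError: (e: unknown) => void = () => {},
): Promise<void> {
  const stream = await transport.createUnidirectionalStream();
  const writer = stream.getWriter();
  await writer.ready;
  writer.write(frame).catch(onError);
  writer.close().catch(onError);
}
```

A caller that awaits the first function once per encoded frame sends frames one at a time. With the second, the next frame can be queued on its own stream while earlier writes are still pending.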
As a result, awaiting the write promise causes only one frame to be sent at a time, so that I-frames and P-frames cannot be sent concurrently. In effect, this re-introduces head-of-line blocking, even when each frame is sent on its own QUIC stream.

If, instead of await writer.write(), we use await writer.ready followed by writer.write(chunk) and writer.close(), the WHATWG Streams buffers are kept full and concurrent sending is restored. The glass-glass latency decreases markedly, and re-ordering is now observed on the receiver (e.g. P-frame(s) are received before the first I-frame). Also, bandwidth consumption is much closer to (and slightly larger than) the average target bitrate.

Another interesting effect is that the excess I-frame RTT observed formerly seems to have decreased. Instead of the I-frame RTT being 2-3 times the P-frame RTTmin, it is now closer to the transmission line. One potential explanation might be that the congestion window in the "after" plot is large enough to allow an I-frame to be sent in a single RTT. In these plots (AV1 at full-HD, 30 fps, 300 Kbps average target bitrate), the I-frame is of modest size (< 12 KB).

A good article on the performance pitfalls of JavaScript async/await is here: https://www.learnwithjason.dev/blog/keep-async-await-from-blocking-execution

A version incorporating concurrent sending and "bring your own buffer reads" can be found here: https://webrtc.internaut.com/wc/wtSender7/

4. SFrame and RTP over QUIC (P. Thatcher, 15 min)

https://github.com/mengelbart/rtp-over-quic-draft/issues/29

https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic

Peter Thatcher: Issue 29 in the RTP over QUIC GitHub repo relates to SFrame. With RTP over QUIC, it is possible to encapsulate an entire SFrame and send this over a QUIC reliable stream. But what happens if you have some participants in a conference that support RTP over QUIC, and others that support only RTP over UDP?
The conference server may then need to convert from large to small. I-D.draft-ietf-avtcore-rtp-over-quic-01 Section 4.1 says that it “may need codec-specific knowledge to packetize the payload of the incoming RTP packets in smaller RTP packets.” What does "codec specific knowledge" mean in the context of SFrame where only the endpoints have the key? With respect to a middle box (to which SFrames are opaque), "may need codec specific knowledge" might really mean “may need SFrame-specific knowledge”. SPacket has the same problem, because there can still be a need to re-packetize. The problem can also occur even if RTP is transported in QUIC datagrams because of differences in overhead compared with RTP over UDP.

Summary of the problem:

1. RTP over QUIC allows for big RTP packets
2. MTU differences require re-packetizing
3. Re-packetizing is “codec-specific” (sframe-specific)
4. Problem cannot be solved purely on the endpoints

To solve the problem, there needs to be a way to indicate how an SFrame is split between packets, to allow the SFrame to be re-assembled when the packets arrive at the endpoint. One potential solution to the problem is to prepend an SFrame sequence number and SFrame chunk index at the beginning of the RTP payload, after which the SFrame follows. Peter shows an example.

Magnus Westerlund: It appears to be a reasonable scheme, but it needs a last fragment indicator.

Peter: Yes. One approach I looked at was to utilize the marker bit.

Jörg Ott: How is the generic problem solved for RTP over UDP? The same problem can occur in a translator, correct? It isn't discussed in the topologies draft. \[It is mentioned in RFC 7667 Section 3.2.1.2, Topo-Trn-Translator\]

Bernard Aboba: Conference servers (which often have codec-specific knowledge) can re-packetize.

Mo Zanaty: How does this deal with authentication and the goal of having only one authentication tag for the whole?

Peter: SFrame is end-to-end.
A middle box can re-assemble the whole SFrame if it needs to.

Harald Alvestrand: SFrame is end-to-end. SRTP handles hop-by-hop encryption and authentication. You should not require SFrames to be verified by middle boxes. That is over-constraining.

Sergio Garcia Murillo: You are trying to solve the middlebox issue without having resolved how SPacket would work in general. We should solve how SFrame over RTP works first.

Peter: This proposal can also be the basis for the SFrame over RTP packet format. Today there is no way for an SFrame to be split up into chunks and be re-assembled. That is a problem whether the endpoint is packetizing (SPacket) or a middle box is re-packetizing.

David Schinazi: QUIC datagram enthusiast. The fundamental problem is that you have MTU issues. The solution is a clone of IPv4 fragmentation and we know that is inefficient.

Bernard Aboba: The goal of RTP re-packetization is to *avoid* IP fragmentation.

David Schinazi: Why can't you do MTU discovery?

Cullen Jennings: In a conference, participants can come and go. An endpoint sending RTP does not track the MTU of each participant and it cannot do MTU discovery to each of the (many) participants. However, one solution might be for RTP senders to use the minimum MTU (e.g. 1200 bytes).

Peter Thatcher: The use case is a mixed conference where some (most?) participants are using RTP over QUIC, but there also might be some RTP over UDP participants.

Jonathan Lennox: This is an issue for SFrame over RTP to solve. It is a general issue that applies not only to RTP over QUIC.

Richard Barnes: The direction that the SFrame WG is going is to only specify the SFrame. There is a lot of ambiguity about the relation between RTP and SFrame. There should be some possibility for datagram packetization layer path MTU discovery similar to what is done in transport protocols.

Cullen Jennings: MTU discovery doesn't work for RTP conferences.

Jonathan Lennox: We need an SFrame over RTP draft.
I don't care if this is done in the AVTCORE or SFRAME WG. Richard noted that the SFrame WG is focused on getting the encryption done.

Sergio: SFrame over RTP was rejected by the WG.

Bernard Aboba: There were suggestions that endpoints do packetization first, then encryption. You had proposed that the endpoints encrypt first, then packetize.

Jonathan Lennox: There are RTP payload specifications that allow for delivery of slices. Packetizing first, then encrypting allows those codecs to packetize as designed. Encrypting and then packetizing does not.

Mo Zanaty: SFrame needs to support being fragmented to what the lower layer can handle. This is basic to *any* transport. I'd argue that SFrame needs to support fragmentation natively.

Spencer Dawkins: We should be looking at RTP topologies and what features we need to support that, in order for SFrame to function.

Magnus: RTP over QUIC should not specify how to fragment SFrame. SFrame fragmentation is needed either natively in SFrame or in the SFrame RTP payload format. I think that the SFrame RTP Payload Format should be done in the AVTCORE WG.

Sergio: There has been little progress on the SFrame RTP Payload format.

Peter: Sounds like people do not believe that the proposal should be submitted as a PR to RTP over QUIC. Should I submit a draft for the "SFrame RTP Payload Format"?

Jonathan Lennox: There have been some RTP issues that have not been resolved because of the lack of dedicated side meetings with the right people. Also, with RTP over QUIC streams allowing for an unlimited MTU, it forces RTP middle boxes to re-packetize to a greater extent than what has been done before. Therefore this issue does appear to be something that RTP over QUIC needs to discuss.

Action: A side meeting will be scheduled to discuss these issues, as well as a virtual interim.

5. RTP over QUIC (J. Ott, M. Engelbart, 20 min)

https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic

Issue 45: The addition of a length field allows multiple RTP packets to be sent on the same QUIC stream. However, this makes it difficult for the receiver to cancel an incoming stream.

Bernard: Not sure that preventing a receiver from cancelling streams is a problem. The sender has more information on what it is sending and the appropriate timeout to set. For example, the sender can set a higher timeout for a key frame or base layer frame than for an SVC extension layer frame that is discardable.

David Schinazi: When the receiver sends a STOP\_SENDING frame, this is to ask the sender to stop and reset the stream. The sender can ensure that it stops on a frame boundary. There might be implementation issues.

Lucas: There may be implementation bugs. What is safe here?

Jörg: An unaligned STOP\_SENDING will only result in loss of the RTP packet that is being received.

Cullen Jennings: (after discussion) I prefer option A (accept that a receiver can't cancel streams). The receiver doesn't know how much data is in flight or will be lost. Sending a STOP\_SENDING frame will save some capacity.

David Schinazi: I apologize for being too clever. We should have called QUIC Streams "Messages" instead. Doing B (require only one frame per stream) will have some advantages.

Piers O'Hanlon: Sending over datagrams will not have this issue.

Jörg: Support for reliable stream transport has been expressed. It is considered desirable.

Magnus: Proposal B will limit the session lifetime to a long but finite session length. This should be noted in the document if it is used. I support option B.

6. SDP for RTP over QUIC (S. Dawkins, 15 min)

https://datatracker.ietf.org/doc/html/draft-dawkins-sdp-rtp-quic

What AVP profiles to register? Would like to conclude on what direction to use here. Asking for a Call for Adoption.

Cullen Jennings: I don't think we are ready for a Call for Adoption yet.

7. Wrapup and Next Steps (Chairs, 10 min)

Jonathan: Apologies for running out of time. Chairs to set up a side meeting on SFrame over RTP as well as scheduling a Virtual Interim.
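As background for the side meeting, the chunking idea raised under item 4 (an SFrame sequence number and chunk index prepended to the RTP payload, plus the last-fragment indicator Magnus asked for) could look roughly like the TypeScript sketch below. It is illustrative only and is not the format Peter presented: the 4-byte header layout, the field widths, the `LAST_FLAG` bit (standing in for the RTP marker bit Peter mentioned), and all function names are assumptions made for this example.

```typescript
// Hypothetical chunk header: 16-bit SFrame sequence number,
// 8-bit chunk index, 8-bit flags. None of these widths come from the minutes.
const HEADER_LEN = 4;
const LAST_FLAG = 0x01;

interface Chunk {
  seq: number;
  index: number;
  last: boolean;
  data: Uint8Array;
}

// Split one (opaque, already encrypted) SFrame into payloads of at most
// maxPayload bytes, each prefixed with the hypothetical chunk header.
function splitSFrame(sframe: Uint8Array, seq: number, maxPayload: number): Uint8Array[] {
  const room = maxPayload - HEADER_LEN;
  if (room <= 0) throw new Error("maxPayload too small for chunk header");
  const count = Math.max(1, Math.ceil(sframe.length / room));
  if (count > 256) throw new Error("SFrame needs more than 256 chunks");
  const out: Uint8Array[] = [];
  for (let i = 0; i < count; i++) {
    const part = sframe.subarray(i * room, (i + 1) * room);
    const pkt = new Uint8Array(HEADER_LEN + part.length);
    pkt[0] = (seq >> 8) & 0xff;
    pkt[1] = seq & 0xff;
    pkt[2] = i;
    pkt[3] = i === count - 1 ? LAST_FLAG : 0;
    pkt.set(part, HEADER_LEN);
    out.push(pkt);
  }
  return out;
}

function parseChunk(payload: Uint8Array): Chunk {
  if (payload.length < HEADER_LEN) throw new Error("payload shorter than chunk header");
  return {
    seq: (payload[0] << 8) | payload[1],
    index: payload[2],
    last: (payload[3] & LAST_FLAG) !== 0,
    data: payload.subarray(HEADER_LEN),
  };
}

// Reassemble the chunks of ONE SFrame, in any arrival order.
// Returns undefined while chunks are missing or inconsistent.
function reassemble(payloads: Uint8Array[]): Uint8Array | undefined {
  const chunks = payloads.map(parseChunk).sort((a, b) => a.index - b.index);
  const lastChunk = chunks[chunks.length - 1];
  if (!lastChunk || !lastChunk.last) return undefined;
  if (chunks.some((c, i) => c.index !== i || c.seq !== lastChunk.seq)) return undefined;
  const total = chunks.reduce((n, c) => n + c.data.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c.data, offset);
    offset += c.data.length;
  }
  return out;
}
```

Because the header sits outside the encrypted SFrame in this sketch, a middle box could call `reassemble` and then `splitSFrame` again with a smaller `maxPayload` without holding the SFrame key, which matches the point made in the discussion that middle boxes should not need to verify SFrames.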