Audio/Video Transport Core Maintenance (avtcore) Working Group
CHAIRS: Jonathan Lennox
Bernard Aboba
Virtual Interim Agenda
Tuesday, October 4, 2022
8 AM - 10 AM Pacific Time
Meeting link:
https://meetings.conf.meetecho.com/interim/?short=80d258a6-5351-479a-99d1-a8953ea3d8c5
Slides:
https://docs.google.com/presentation/d/1-QmdrLIpJ48dm-ut2vuAqxPz9Ts8sIcc5m0eiSlye88/
1. Preliminaries (Chairs, 10 min)
Note Well, Note Takers, Agenda Bashing, Draft status, Calls for Adoption
- Please check that Spencer, Harald, and Roni captured your comments
accurately ...
- Zulip is now the right place for text conversations - it's echoed in
jabber
- Two RFCs published, three more in the RFC Editor queue.
- RTP Payload Format for V3C is in "Call for Adoption"
- "Game State over RTP" call for adoption has ended with little
enthusiasm, and draft has expired. Chairs will follow up with the
authors
https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-scip
- -02 reflects responses to GENART and ARTART reviews
- SECDIR comments arrived September 7, so still working through these
- Security considerations are boilerplate for RTP specifications -
what needs to change?
- Now successfully submitting XML version (yay!)
- Experience has been that authors respond to review comments, but
reviewers often don't respond - AVTCORE is now requesting early
reviews before PUBREQ, to allow time for authors to follow up with
reviewers before IESG balloting.
3. RTP over QUIC Sandbox (B. Aboba, 25 min)
https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic
- This is an experiment, to get a feel for how RTP will work over
QUIC.
- Minimal implementation (no WASM, JS only). Allows users to vary
settings and see what the results are.
- Post-experiment diagnostics provide metrics and graphs
- Built on "next generation" Web media APIs - result works
surprisingly well.
- Two versions - encode and decode in streams pipeline without
transport, or use network transport
- Supports VP8, VP9, H.264/AVC, AV1. Seeing some oddities with VP9
encode in the "realtime" setting. H.265 encode isn't supported.
- Can experiment with hardware acceleration, latency, scalability,
resolution settings.
- Some good signs on video quality, resilience. CPU utilization can be
an issue with higher resolutions (and with AV1 in particular).
- Partially reliable transport and temporal scalability are a good
combination - most frames are "discardable" so you need not
retransmit them indefinitely. Can set a timer and then send a
RESET_STREAM frame.
- Observed latency MUCH higher than measured frame RTT for higher
resolutions.
- P-frames typically small (a few packets), I-frames are much larger
and have frame RTT multiple times higher.
- Can compare no-network transport to network transport cases.
No-network has much lower glass-to-glass latency - so the added
latency is not purely due to encode/decode.
- Potential explanation (still need to confirm): P-frames are small,
can be sent in a single round trip, close to RTTmin. With GoP = 300,
299/300 frames sent are P-frames. So congestion window stays small.
When much larger I-frame comes along, it cannot be sent in a single
round-trip, so we see frame RTTs several times larger.
- This does not appear to be the result of loss and retransmission,
and congestion window contraction.
- CPU utilization. With high resolutions (and AV1 codec in particular)
CPU utilization is high, at times consuming 100 percent of CPU.
- Investigating use of multiple worker threads (e.g. send and receive
pipeline in separate threads). If the encoder is already
multi-threaded, this may not help.
- Finding issues in the RTP-over-QUIC specifications, and filing them
in GitHub. Mathis will talk about these: partial reliability,
multiplexing of media and data, RTP topology issues.
- Spencer - do we really need datagrams to do RTP over QUIC well?
- Bernard - P-frame round-trip times are close to RTTmin, so that is
working well with frame/stream transport. I-frame frame RTTs are
much larger, though.
- Peter Thatcher: If the issue is that the congestion window is too
small, QUIC datagrams would not perform better.
- Bernard: Yes, that's true. But RTP over UDP congestion control with
probing (as in WebRTC) might do better than RTP over QUIC with
current QUIC cc algorithms (BBRv1, NewReno).
- Jonathan - any input to WebTransport working group or W3C Web API
groups?
- Bernard: Currently browser performance tools do not support
WebTransport, and there is no equivalent of WebRTC internals for
either WebCodecs or WebTransport. So debugging is tedious.
- Sergio - would it make sense to packetize the frame in multiple RTP
packets?
- Bernard: If the issue is that the congestion window is too small,
then packetization wouldn't help.
- Harald - congestion control is linked to congestion control models -
if you're looking at bandwidth based congestion control, it's much
easier to pace
- Bernard - I agree, but I'm not seeing that pacing is the problem in
my experiments, because the congestion window appears to be too
small to dump too many packets into the network.
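The potential explanation discussed above (small P-frames keep the congestion window small, so a much larger I-frame cannot be sent in a single round trip) can be illustrated with simple arithmetic. The sketch below is not from the presentation: the packet size, window size, frame sizes, and RTT are hypothetical values, and the model assumes a fixed congestion window with no window growth, loss, or pacing.

```python
import math

def round_trips(frame_bytes: int, cwnd_packets: int, mtu_bytes: int = 1200) -> int:
    """Round trips needed to send one frame through a fixed congestion window."""
    packets = math.ceil(frame_bytes / mtu_bytes)
    return math.ceil(packets / cwnd_packets)

def frame_rtt_ms(frame_bytes: int, cwnd_packets: int, rtt_min_ms: float,
                 mtu_bytes: int = 1200) -> float:
    """Approximate frame RTT as (round trips needed) x (minimum RTT)."""
    return round_trips(frame_bytes, cwnd_packets, mtu_bytes) * rtt_min_ms

# Hypothetical numbers chosen only for illustration.
P_FRAME_BYTES = 3_000    # "a few packets"
I_FRAME_BYTES = 60_000   # much larger key frame
CWND_PACKETS = 10        # window that stayed small while sending P-frames
RTT_MIN_MS = 50.0

p_rtt = frame_rtt_ms(P_FRAME_BYTES, CWND_PACKETS, RTT_MIN_MS)
i_rtt = frame_rtt_ms(I_FRAME_BYTES, CWND_PACKETS, RTT_MIN_MS)
print(f"P-frame RTT ~ {p_rtt} ms, I-frame RTT ~ {i_rtt} ms")
```

With these assumed numbers the P-frame fits in one window (frame RTT close to RTTmin), while the I-frame needs five windows, so its frame RTT is several times larger even without any loss.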
4. RTP over QUIC (J. Ott, M. Engelbart, 25 min)
https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic
- Presenter: Mathis Engelbart
- Changes since IETF 114
- WIP - topology, stream concurrency, expressing congestion control
requirements instead of specifying algorithms
- ALPN - Issue #31 - we can't multiplex RTP and arbitrary protocols
over QUIC. Proposal is to define rtp-quic, and leave anything else
to other documents
- Multiplexing RTP/RTCP - Issue #24 - If we do "rtp-quic", we can
still multiplex RTP and RTCP over the same QUIC connection.
- Bernard - if you have multiple ALPNs, you can't put that over a
single QUIC connection.
- With respect to multiplexing data with RTP, small updates won't
interfere with media, but if you are trying to do a file transfer,
then you typically want the data to not interfere with media. We
need to think about this a little more.
- Bernard - possibly confusing to have two ALPNs for the same thing
- Bernard - the reason we have session IDs in WebTransport is to
support multiple browser tabs. The session ID allows multiple
instances of the same application to share the connection. Seems
similar to RTP sessions.
- Peter Thatcher - is there a reason why RTP over QUIC is using flow
IDs instead of session IDs?
- Mathis - I do think rtp-mux-quic could make sense, but we can focus
on rtp-quic for now
- Peter - It seems like you could write the draft to cover both RTP
over raw QUIC as well as RTP over WebTransport.
- Mathis - so this is abstracting from QUIC
- Peter - exactly. That's deferring the binding to QUIC connections,
allowing different choices
- Jonathan - while there are some differences between raw QUIC and
WebTransport, writing the spec to allow both would be sensible
- Mathis - then we'd need another multiplexing for raw QUIC?
- Jonathan - Demuxing with something custom makes sense at the
WebTransport level, but over raw QUIC, I don't think it makes sense
to define a demux until we know what we'd be demuxing with, which I
don't think we're anywhere near yet.
- Mathis - I need to think about this a bit more - let's move on to
the next topic
- Joerg - re signaling, the question is "signaling between whom?"
- Bernard - in a conferencing scenario, the signaling and media go to
the same endpoint. So you might want to do the signaling first, then
send media. There would not be a need to open two QUIC connections.
- Joerg - a lot of decisions we undertook were trying to be simple, so
we wouldn't get wrapped up in different kinds of traffic being
multiplexed over the same connection. I'd like to do something
simple first, and not boil the ocean.
- Bernard - we could just do rtp-mux-quic and recognize that it
doesn't solve all the problems immediately. So, do something simple
and recognize limitations - I'm worried that people will define
their own muxing for anything other than muxing RTP.
- Mathis - would like to hear more from the group
- Spencer - about ALPNs that change behavior versus changing to a new
ALPN. Minimizing the number of connections turned out to be critical
for WebRTC, so expecting the same sort of thing here.
- Length field for identifying incomplete frames - Issue #39 - Need to
know when you've received a frame so you can close a connection,
need to allocate buffers, etc.
- Jonathan - does this tell you the difference between a stream that
was reset and truncated, and a stream that was received in full? Can't think
of downsides, but streams are already doing framing. This is an API
question.
- Bernard - The WebTransport API supports surfacing of RESET_STREAM
as an error, so you SHOULD be able to tell the difference. But the
RESET_STREAM frame might not be forwarded, so it might not arrive,
or if it does arrive, it might show up after the FIN. Draft for
reliable RESET_STREAM has been submitted to QUIC, but still need to
do more investigation. Also, it does not seem that all QUIC
implementations handle RESET_STREAM the same way.
- Harald - length field added from HTTP/0.9 to HTTP/1 - not always the
best way to delimit, because you can't change your mind. Not the
only possible solution, and maybe not the best. If we have a length,
prefer variable length
- Peter - isn't FIN reliable? Should be reliable through the API?
- Bernard - RESET_STREAM is supposed to stop retransmissions of the
stream. Would that include the FIN? We should check the QUIC
specification again - I read it several times and was still
confused.
- Jonathan - undefined behavior
- Mixing streams and datagrams - Issue #41 - current draft supports
both, but not at the same time. Encountered synchronization issues
in testing. Not sure what to do here, want to hear opinions of the
group
- Jonathan - is there a substantive difference between using datagrams
and length<MTU streams? It's one packet either way.
- Mathis - small difference, but the real difference would be larger
frames (length>MTU)
- Jonathan - not sure that we need datagrams anyway
- Mathis - also for less important things?
- Peter - from receiver side, there's no difference. I'm in favor of
mixing.
- Bernard - agree with Peter. If you have a jitter buffer, you should
be able to handle either. Only difference is whether the buffer will
consist of entire frames or mixtures of those with frames still
being assembled.
- Bernard: Stream/frame with partial reliability should get you pretty
close to datagrams anyway
- Mathis - please think about cons
- Next steps - new draft, continue working issues, work with Spencer
on SDP
- Peter - late to the party, but is sticking unmodified RTP frames in
QUIC just silly?
- Jonathan - probably for RTP-QUIC/RTP-UDP interop, plus not redoing
30 years of work ...
- Joerg - trying to retain as much backwards compatibility as possible
- Bernard - The interop is one-way: can go from RTP to RTP over QUIC
frame/stream but not the other way, unless the middle box has
codec-specific knowledge so it can packetize. Interop with datagrams
is also tricky because of MTU differences. Might have to
re-packetize between RTP and RTP over QUIC datagrams as well.
- Peter - how do you forward big packets?
- Bernard - middlebox has to have codec specific knowledge. If it is
an SFrame, the frame is opaque, so that wouldn't be possible unless
the middlebox had the key.
- Peter - I'd be willing to think about redefining 30 years of work.
Is anyone else?
- Jonathan - yes, the question is whether anyone wants to do that. New
framing is interesting idea, some quick and dirty implementations
would be helpful in answering the question
- Sergio - just replacing DTLS with QUIC is easy, but the more we
change, the more work on interop we have to do, if it's not even
really RTP any more
- Spencer - I thought we were waiting on topics that involved topology
until we had a finished first version.
- Stephan - the question about redoing framing is a good one, but it's
really BOF material, and would derail work that is probably halfway
to being finished. It could easily become an exercise in boiling the
ocean.
- Bernard - I'd suggest not trying to solve the topology issues, just
recognize them and document them.
- Jonathan - so you don't paint yourself into a corner. Also, MOQ has
been chartered, and might be a better place
- Spencer - and if MOQ isn't a better place (after reading the MOQ
charter), having a side meeting for IETF 115 would be useful if you
do want to propose a BOF anytime soon
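The Issue #39 discussion above (a length field so a receiver can tell a complete frame from a truncated one, with a preference for a variable-length encoding) can be sketched as follows. This is an illustration only, not the draft's wire format: it assumes one frame per stream, prefixed with a QUIC variable-length integer as defined in RFC 9000, Section 16, and the helper names are hypothetical.

```python
from typing import Optional, Tuple

def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC variable-length integer (RFC 9000, Section 16)."""
    if not 0 <= value < (1 << 62):
        raise ValueError("value out of range for a QUIC varint")
    if value < (1 << 6):
        return value.to_bytes(1, "big")
    if value < (1 << 14):
        return (value | (0b01 << 14)).to_bytes(2, "big")
    if value < (1 << 30):
        return (value | (0b10 << 30)).to_bytes(4, "big")
    return (value | (0b11 << 62)).to_bytes(8, "big")

def decode_varint(buf: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed); raise ValueError if buf is too short."""
    if not buf:
        raise ValueError("empty buffer")
    size = 1 << (buf[0] >> 6)  # top two bits select 1, 2, 4, or 8 bytes
    if len(buf) < size:
        raise ValueError("truncated varint")
    mask = (1 << (8 * size - 2)) - 1  # clear the two length-selector bits
    return int.from_bytes(buf[:size], "big") & mask, size

def frame_payload(payload: bytes) -> bytes:
    """Hypothetical framing: varint length followed by the payload."""
    return encode_varint(len(payload)) + payload

def read_frame(stream_bytes: bytes) -> Optional[bytes]:
    """Return the payload if all of it arrived, or None if the stream was cut short."""
    try:
        length, offset = decode_varint(stream_bytes)
    except ValueError:
        return None
    body = stream_bytes[offset:offset + length]
    return body if len(body) == length else None
```

With this kind of prefix, a receiver that sees the stream end before `length` bytes have arrived knows the frame is incomplete without relying on how RESET_STREAM is surfaced by the API. The sender must know the frame size before writing, which is the "can't change your mind" limitation noted in the discussion.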
https://datatracker.ietf.org/doc/html/draft-he-avtcore-rtcp-green-metadata
- Presenter: Yong He
- Pointing at ISO/IEC 23001-11 (power efficient media consumption)
- Defining two new payload-specific feedback messages - resolution
request and resolution notification
- Presenting draft updates (slide 39)
- Would like to ask for WG adoption of draft
- Harald - what is the interaction with scalable video coders? They carry
multiple resolutions in multiple layers, and requesting a resolution
can be mapped in multiple ways. Have you looked at this?
- Yong - if stream is not SVC, you probably need metadata to choose
resolutions. Don't think SVC has been deployed much.
- Jonathan - to answer Harald's question, what the codec does to
provide resolution is up to the implementation, but this information
could be useful to the codec in making decisions about how to
provide the resolution
- Srinivas - feedback messages are important, and a few more updates
likely to be presented in the near future
- Jonathan - adopt and then present, or present and then adopt? Is
this different functionality? If so, should probably present and
then adopt
- Stephan - we don't have to wait. Some stuff is still being
fine-tuned in MPEG, but nothing that should affect our decision
whether to adopt now. This is all temporal scalability
- Jonathan - there are like four messages in MPEG, and this document
only has one, right?
- Jonathan - you can provide additional messages in different
documents, but if this maps to one MPEG specification, probably best
to have them in one document
- Stephan - next MPEG meeting is just before IETF 115 in London
- Srinivas - prefer to add messages and then call for adoption
- Yong - don't want to add additional messages
- Jonathan - we could do a call for adoption on this document with its
current contents, and any new messages would show up in another
document
- Jonathan - chairs will do the call for adoption now
6. Wrapup and Next Steps (Chairs, 15 min)
- There were no next steps - see you in London!