Minutes interim-2023-avtcore-01: Thu 16:00
minutes-interim-2023-avtcore-01-202302231600-00
| Meeting Minutes | Audio/Video Transport Core Maintenance (avtcore) WG | |
|---|---|---|
| Date and time | 2023-02-23 16:00 | |
| Title | Minutes interim-2023-avtcore-01: Thu 16:00 | |
| State | Active | |
| Last updated | 2023-02-24 |
Audio/Video Transport Core Maintenance (avtcore) Working Group
CHAIRS: Jonathan Lennox
Bernard Aboba
Virtual Interim Agenda
Date: Thursday, February 23, 2023
Time: 08:00 - 10:00 Pacific Time
Notes:
https://notes.ietf.org/s/notes-ietf-interim-2023-avtcore-01-avtcore
Meeting link:
https://datatracker.ietf.org/meeting/interim-2023-avtcore-01/session/avtcore
Remote instructions:
https://meetings.conf.meetecho.com/interim/?short=4da9b4c1-62cd-458e-8a8f-58fa91cf8ce4
Slides:
https://docs.google.com/presentation/d/1QAo7WiUmIfKWYp5ntV37aYvtnSuyZ0gNqrcrnQS4KkM/
- Preliminaries (Chairs, 15 min)
Note Well, Note Takers, Agenda Bashing, Draft status
- RTP Payload Format for SCIP (D. Hanson, 15 min)
https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-scip
- RTP Control Protocol (RTCP) Messages for Green Metadata (Yong He, 15 min)
https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtcp-green-metadata
- RTP over QUIC (J. Ott, M. Engelbart, S. Dawkins, 25 min)
https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic
- Viewport and Region-of-Interest-Dependent Delivery of Visual
Volumetric Media (S. Gudumasu, 10 min)
https://datatracker.ietf.org/doc/html/draft-gudumasu-avtcore-rtp-volumetric-media-roi
- Wrapup and Next Steps (Chairs, 15 min)
Draft status updates
- VP9 payload is in MISSREF, waiting on framemarking draft.
RTP Payload Format for SCIP (Dan Hanson, Mike Faller)
IESG Ballot has completed, 3 DISCUSS comments.
- In SCIP, packetization and rate control are handled by the underlying
codec. If the underlying codec (e.g. G.729) does not have the
ability to control rate, then the ability of SCIP to respond to
congestion will be limited.
- SCIP supports both control and data traffic. Control traffic handles
key management as well as negotiation. Control traffic has
structure as defined in the SCIP documents and includes a length
field, so SCIP control messages can be split between RTP
packets. Data traffic consists of audio/video as well as chat and
whiteboard; it is encrypted and therefore appears to RTP as an opaque
blob that can be split between RTP packets. This means that the SCIP
stream can appear differently depending on the state of SCIP and
whether the traffic represents control or data.
- SCIP authors are looking for WG review of the proposed changes
before publishing a revision 5.
- Bernard Aboba: Would be useful to take a look at the ballot
positions. The IESG is looking for a standard RTP payload type
document, with sections on the RTP packet format, RTCP feedback,
etc. but they didn't get that.
- BA: The SCIP authors appear to have responded to Francesca's review
comments relating to change control. Roman is asking if the traffic
is entirely opaque - once you've exchanged the keys, it is entirely
opaque, correct?
- Dan Hanson: Yes. But the control traffic has structure, as described
in SCIP-210.
- BA: While you don't have to go into detail on the key establishment
process, I do think you need to describe how control traffic and
data traffic are packetized. Once keys are negotiated, the data
traffic is entirely opaque. Packetization is handled by the
underlying codec, which hands the payload up to SCIP for encryption
and placement in the payload field. So if the codec is H.264, an
H.264 packetizer is providing a payload in RFC 6184 format to SCIP
for encryption.
- BA: You may need to say a little bit about how the SCIP messages are
fragmented into RTP packets.
- DH: The messages have their own header with message length.
- BA: So you will need to describe how the SCIP control messages are
split between packets (e.g. how fragmentation works) as well as how
the fragments are de-packetized.
- BA: There is another comment from Roman about secure session
establishment protocol behaviour. I'd suggest talking about the
basics, but I don't think you need to reproduce the SCIP state
machine, because you just need enough to describe how packetization
and de-packetization works.
- BA: Sarker's comment asks what RTP profile should be used. Also need
to specify exactly what "highly variable" means.
- Jonathan Lennox: The RTP profile will depend on the underlying
codec.
- BA: However, SDP does not reflect that, because the negotiation is
handled within SCIP.
- Mike Faller: We've been throwing the word "codec" around a lot, but
that underlying codec could be a chat session, so it's better to
think of it as data being carried by SCIP.
- BA: Yes, since the payload can represent media or chat or
whiteboard, data is probably a better term.
- BA: This seems like a question that a description of the overall
architecture and layering might address.
- MF: What do you need to know about SCIP to implement it?
- BA: Since negotiation of the underlying codec is handled by SCIP,
the RTP profile does not need to be negotiated in SDP. Also, that is
why the traffic is "highly variable".
- MF: One of the fears from networking equipment vendors is wanting to
know how to filter on SCIP, but you can't as it's so variable and
changing.
- Jonathan Lennox: Need a statement in the document to just not try.
- BA: QUIC deals with this in the "ossification" section - you could
add a section here telling implementers to not bother with deep
packet inspection because the traffic can vary depending on whether
it is data or control traffic. Attempting to parse opaque data is
pointless.
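The fragmentation point raised above (control messages carry their own header with a message length, so they can be split between RTP packets and recovered on receipt) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the 2-byte big-endian length prefix, the function names, and the 16-byte payload limit are hypothetical and are not taken from SCIP-210 or the draft. The sketch also assumes payloads arrive complete and in order (for example, after reordering by RTP sequence number).

```python
import struct
from typing import Iterable, List

# Hypothetical framing: a 2-byte big-endian length prefix per message.
# This is NOT the SCIP-210 header layout, only a stand-in for "a header
# that includes a length field".
LENGTH_HEADER = struct.Struct("!H")


def frame_message(body: bytes) -> bytes:
    """Prefix a control message with its length."""
    return LENGTH_HEADER.pack(len(body)) + body


def fragment(stream: bytes, max_payload: int) -> List[bytes]:
    """Split a byte stream into RTP-payload-sized chunks."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [stream[i:i + max_payload] for i in range(0, len(stream), max_payload)]


def reassemble(payloads: Iterable[bytes]) -> List[bytes]:
    """Recover whole messages from in-order payloads using the length prefix."""
    buffer = b"".join(payloads)
    messages = []
    offset = 0
    while offset + LENGTH_HEADER.size <= len(buffer):
        (length,) = LENGTH_HEADER.unpack_from(buffer, offset)
        start = offset + LENGTH_HEADER.size
        end = start + length
        if end > len(buffer):
            break  # trailing message is incomplete; wait for more payloads
        messages.append(buffer[start:end])
        offset = end
    return messages


# Usage: two messages of different sizes split across small payloads.
original = [b"key-exchange", b"capabilities" * 10]
payloads = fragment(b"".join(frame_message(m) for m in original), max_payload=16)
recovered = reassemble(payloads)
```

Because the receiver relies only on the length field, it does not need to interpret the message contents, which matches the point that encrypted data traffic is opaque to RTP.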
RTP Control Protocol (RTCP) Messages for Green Metadata (Yong He)
- Jonathan Lennox: Question from Nokia - if there was any negotiation
of the resolution in SDP, how does that interact with these
messages? Probably that you should never go above the resolution
which SDP allows.
- JL: Question from Magnus Westerlund about the format
- JL: Better to call the document "Temporal-Spatial Resolution" as the
Green Metadata effort within MPEG is more than just TSR.
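One possible reading of the suggestion above (a requested resolution should never exceed what SDP allows) is a simple clamp applied by the sender. The sketch below is an assumption, not behaviour defined by the draft: the function name, the tuple representation, and the choice to preserve aspect ratio when scaling down are all hypothetical.

```python
from typing import Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def clamp_to_sdp_limit(requested: Resolution, sdp_max: Resolution) -> Resolution:
    """Return the requested resolution, scaled down if it exceeds the SDP limit.

    Hypothetical policy: keep the aspect ratio and never exceed either
    negotiated dimension.
    """
    req_w, req_h = requested
    max_w, max_h = sdp_max
    if min(req_w, req_h, max_w, max_h) <= 0:
        raise ValueError("dimensions must be positive")
    if req_w <= max_w and req_h <= max_h:
        return requested
    scale = min(max_w / req_w, max_h / req_h)
    return (int(req_w * scale), int(req_h * scale))


# Usage: a 4K request against a 1080p SDP limit, and a request within limits.
limited = clamp_to_sdp_limit((3840, 2160), (1920, 1080))
unchanged = clamp_to_sdp_limit((1280, 720), (1920, 1080))
```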
RTP over QUIC (Mathis Engelbart, Spencer Dawkins)
- Mathis Engelbart: Had feedback that the congestion control section
needed to describe what kind of congestion control was required, and
which layer should perform congestion control.
Spencer's summary of controversial ideas in congestion control:
- "Disabling QUIC CC" - obviously, any implementation can do
anything at its end, but there's not a defined way to tell the OTHER
end to do that. So, how do we make sure we're talking to an "other
end" that will Do The Right Thing? Port numbers? ALPN (as is in the
draft today)? Just assume the RTP sender will never cause QUIC CC to
kick in, because it's interactive media, and think happy thoughts?
Other ideas?
- Conforming to RFC 8085 (because we're running over UDP with little
or no congestion control happening there).
- Is there an MTI rate adaptation algorithm for RTP-over-QUIC?
Spencer doesn't think so, because NADA and SCReAM are IRTF-stream
algorithms. For some value of "we", "we" could talk to the authors
about how to bring this through the IETF. Are we going to do that?
Spencer also notes that Christian Huitema has a proposal for
media-aware CC in MOQ now, so it may be too soon to pick an
algorithm now.
- Spencer Dawkins: Been talking about disabling QUIC CC for years. Any
implementation can do whatever it wants at its end, but no way for
the peer to be told to do the right thing. Could be in ALPN, but
some people don't like that.
- SD: What would we need to do to conform to RFC 8085 as we're running
over UDP, and turning off QUIC CC would require understanding of
what the app layer will do?
- Bernard Aboba: In QUIC, the two sides don't need to use the same
congestion control algorithm. The CC algorithm isn't negotiated, and
yet the two sides can interoperate. On disabling QUIC CC, is this a
serious thing that people want to do? We've talked in various places
about trying to choose CCs that would work. If you are compiling a
QUIC library within your application, you could remove the QUIC CC
algorithm and handle CC in the application instead, but it's highly
unlikely that a browser would allow an application to disable QUIC
CC. As an example, in WebTransport, there is no constructor argument
for "no congestion control".
- SD: There's two layers to this - using the language of disabling
QUIC CC, but how do all the entities in an implementation know who
is doing congestion control, and how to know that's not appropriate
for H3? I.e. don't use BBR or other CCs that do bandwidth probing.
- BA: In the case you've just mentioned, it's possible for one peer to
use BBR and the other to use New Reno. The Probe RTT phase would
destabilize rate control (the application sees loss or increased
delay, so it reduces rate, then probe RTT kicks in, delay goes back
up and application will reduce rate even more though it is not
necessary). So rather than worrying about CC negotiation, the draft
should instead describe what algorithms are likely to provide good
results.
- SD: Should we have a mandatory to implement rate adaptation
algorithm? Spencer doesn't think so, but wondering if others agree?
- BA: Rate adaptation depends on what mechanisms the encoder provides.
The encoder may support scalable video coding, which enables
dropping or adding of layers, or it may support per-frame or
per-macroblock QP control. Or the encoder might just allow changes
in resolution or framerate. So a given rate control algorithm may
not be implementable if the encoder doesn't offer the required
controls.
- SD: A lot of the work in RMCAT (NADA, SCReAM) was done on how to not
break other flows of NADA or SCReAM, but not necessarily what would
break other congestion controllers.
- Peter Thatcher: I don't think this is a question of whether we're
turning off QUIC CC - what is the feedback it's going to use to
implement feedback to RTP - is that going to be feedback from QUIC
or from RTP? Are we going to extend QUIC to have the timestamps
necessary, or is it going to be feedback embedded in streams or
datagrams. How implementations use that feedback should be up to
that implementation.
- SD: Agree with Peter. I want to improve the quality of discussions
by having better descriptions on issues on github. We haven't had a
lot of conversation that focuses on all three options.
- Jonathan Lennox: Agree on the question of whether you have congestion
control at the RTP level - I think the decision of how much is available to
send can happen at either end, but the decision of what to send
needs to be decided by the sender only and this needs to be
clarified. My expectation is that if you run a rate-based CC like
GCC or NADA on top of a CC like New Reno or CUBIC, they'd work
together well, but not with BBR. The issue is
communication between the layers.
- BA: Key frames cause the most issues because they are so much larger
than everything else. Average Bitrate Targets are just averages, but
since keyframes are 10+ times larger than P-frames, keyframes are
much more susceptible to loss and queuing delays. Also, when sent
over QUIC reliable transport, the size of the congestion window is
useful to know, since keyframes may require multiple roundtrips to
send. For example, at high resolutions, the keyframe might be 150 KB
and if the cwnd is 15 KB (10 packets), that implies 10 roundtrips!
This can dictate the startup latency, or the length of a video
freeze that will be experienced in the event that a keyframe is needed
to switch between streams or to recover from loss. W3C WebTransport
WG has been working on metrics to provide to the application.
- SD: Going to have an arms race between people wanting to send more
and those wanting better compression. The very biggest frames that
we send are getting bigger over time, until someone comes up with a
way to amend that.
- BA: Better compression means you can either keep the quality the
same and save bits, or use the same number of bits but do more with
it. In many applications (such as 4K gaming or AR/VR) the
inclination will be to provide higher quality, so the bitrate won't
go down.
- SD: Mathis and Joerg have already done some due diligence with QUIC
CC, and noted it wasn't kicking in on some representative RTP
traffic. It didn't, but we need to keep an eye open in case that
starts happening.
- BA: The amount of concurrency is an issue. If you're sending the key
frame at the same time as P-frames, this allows the pipe to stay
full even if the keyframe is experiencing loss and retransmission.
While this may drag out the length of time needed to send a key
frame, it can decrease glass-to-glass latency because the pipe is kept
full, and the cwnd grows or recovers more quickly (since it is being
pushed up by the concurrent sending of P-frames).
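The keyframe arithmetic mentioned in the discussion above (a 150 KB keyframe with a 15 KB congestion window implying 10 round trips) can be reproduced with a short sketch. The function names are hypothetical. The first function assumes the window stays fixed, which is the assumption behind the figure quoted in the meeting. The second assumes the window doubles every round trip with no loss, an idealised slow-start model that is not taken from the minutes and is included only to show how sensitive the estimate is to window growth.

```python
def round_trips_fixed_window(frame_bytes: int, cwnd_bytes: int) -> int:
    """Round trips needed to send a frame if the congestion window never changes."""
    if frame_bytes < 0 or cwnd_bytes <= 0:
        raise ValueError("frame_bytes must be >= 0 and cwnd_bytes > 0")
    return -(-frame_bytes // cwnd_bytes)  # integer ceiling division


def round_trips_doubling_window(frame_bytes: int, initial_cwnd_bytes: int) -> int:
    """Round trips needed if the window doubles each round trip (idealised, no loss)."""
    if frame_bytes < 0 or initial_cwnd_bytes <= 0:
        raise ValueError("frame_bytes must be >= 0 and initial_cwnd_bytes > 0")
    sent = 0
    cwnd = initial_cwnd_bytes
    round_trips = 0
    while sent < frame_bytes:
        sent += cwnd      # one window's worth of data per round trip
        cwnd *= 2         # assumed growth; real controllers differ
        round_trips += 1
    return round_trips


KEYFRAME_BYTES = 150 * 1000  # 150 KB keyframe from the discussion
CWND_BYTES = 15 * 1000       # 15 KB congestion window from the discussion

fixed = round_trips_fixed_window(KEYFRAME_BYTES, CWND_BYTES)        # 10
growing = round_trips_doubling_window(KEYFRAME_BYTES, CWND_BYTES)   # 4
```

Multiplying either result by the path RTT gives a rough lower bound on the startup delay or freeze duration attributable to the keyframe alone.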
Viewport and Region-of-Interest-Dependent Delivery of Visual Volumetric Media (Srinivas Gudumasu)
- Jonathan Lennox: Are you seeking adoption?
- Srinivas Gudumasu: Not at this time, just looking for feedback.
- Spencer Dawkins: Is this the rtp-v3c-00 draft?
- SG: No this is on top of that
- JL: Might be better for MPEG to define what a region and the
viewport is - make sure you have the correct experts looking at it.
- SG: Already working on this.
- Mo Zanaty: You were talking about coordinate systems - does V3C
describe in terms of meters? Or are they arbitrary and not based on
real-world sizes? Might not be interoperable when looking at an
observer's real world dimensions.
- SG: The sender converts the dimensions and translates them into the
coordinates for the spatial regions.
- Mo Zanaty: Needs more wordsmithing to make this more obvious that
it's images and pixels relative to the media itself.
Action items
- SCIP authors to add material to the introduction providing an
overview of SCIP, section on RTP payload format (including
packetization/de-packetization) answering IESG questions.
Log of Zulip Chat
Sam Hurst
I can try taking notes - will be my first time for an IETF meeting but I
can give it a go
11:05:52
Dale Worley
I'm not hearing any audio, though the browser and Meetecho think I've
got sound enabled and Meetecho reports multiple kbps. I heard the
previous speaker.
11:17:11
Jonathan Lennox
Dale, is that still the case?
11:19:14
Dale Worley
Yes, and toggling the immediately obvious controls doesn't fix it. I am
getting the notifications as to who is speaking.
11:20:12
Ugh, my UI problem. The "Mute Audio" icon in the upper left is "mute
audio to me" whereas the same icon in the lower right is "mute audio I
am generating".
11:22:23
Sam Hurst
My audio glitched out for a moment there - what doesn't matter (so I can
add it to the notes)
11:33:03
Rui Paulo
we can also tell the other side to disable CC via QUIC transport
parameters
11:50:17
Joerg Ott
That would mean that a QUIC library without RTP would be able to signal
that, too, which we may not want to suggest. No CC at all would not be a
good idea.
11:51:10
Peter Thatcher
This isn't really about "disabling QUIC CC". This is more about "the
congestion control for the RTP packets is done using feedback other than
QUIC feedback". Whether it's "in QUIC" or "out of QUIC" is an
implementation detail. The real question for a standard is: what
feedback is being used. Is it QUIC feedback or RTP/RTCP feedback?
11:54:30
Joerg Ott
Right, I am saying only that you should not be allowed to negotiate
turning off QUIC CC inside QUIC transport parameter negotiation
11:55:39
Peter Thatcher
For example, in the WebRTC world, we don't standardize googcc. We
standardize transport-wide-cc (or whatever was the result of trying to
standardize it :).
11:56:01
I agree it doesn't make sense to negotiate turning off QUIC CC.
It might make sense to negotiate some kind of QUIC version of REMB,
though.
11:56:58
Harald Alvestrand
well, we failed to find the resources to actually propose transport-cc
for standardization, so that kind of died.....
11:57:10
the proper view is probably that we should have a congestion limitation
as part of QUIC, but a decision on what to send next (or decide not to
send) as part of the application.
11:58:21
Peter Thatcher
Options for feedback:
A. REMB style: have the receiver tell the sender a number. Probably need
a timestamp similar to the RTP abs-send-time header extension attached
to each packet for the receiver to make a good calculation.
B. transport-wide-cc style: have the receiver send per-packet receive
timestamps in ACKs. Let the sender do the calculation.
I think in either case, extending QUIC will be much better than doing
this above QUIC.
12:02:25
I think this works for B:
https://www.ietf.org/archive/id/draft-smith-quic-receive-ts-00.txt
12:03:52
Joerg Ott
B. is what we have in the draft
12:05:15
including quite a bit of discussion on what to inherit from the QUIC
layer
12:05:40
Vidhi Goel
How big is the Pframe?
12:06:12
Joerg Ott
Of course, good ol' RTCP is also an option when you need something that
QUIC doesn't give you (yet)
12:06:29
Peter Thatcher
Joerg: That sounds like a good approach. Why don't we just try and
implement that with, say, googcc, and see if it works well? I'm guessing
it will. What more do we need?
12:07:08
Mo Zanaty
MOQ has a similar need / dilemma. That argues for solving this within
the QUIC CC and feedback layer, not above it.
12:07:53
Peter Thatcher
And I'd like googcc to work in WebTransport as well.
12:08:14
Mo Zanaty
For receive timestamp extensions to QUIC, there is also Christian's
original proposal:
https://datatracker.ietf.org/doc/draft-huitema-quic-ts/
12:09:10
Peter Thatcher
If anyone is interested in adding support for
https://www.ietf.org/archive/id/draft-smith-quic-receive-ts-00.txt and
googcc to an impl of QUIC in Rust, let me know. I'm interested in making
that work.
12:10:30
Joerg Ott
Peter: we implemented quite a bit of this, and it seemed to work. But
the details of QUIC CC interaction remain.
12:10:43
Let's chat offline about this (I will have to disappear shortly for
boarding my flight)
12:11:22
Vidhi Goel
I can work on congestion control issues
12:13:29
Peter Thatcher
Joerg: I'm interested to see how far you got
12:13:50
If anyone wants to chat QUIC CC offline, send me an email
(pthatcher@microsoft.com)
12:15:25
Joerg Ott
We'll reach out.
12:15:44