Minutes interim-2023-avtcore-01: Thu 16:00
minutes-interim-2023-avtcore-01-202302231600-00

Meeting Minutes Audio/Video Transport Core Maintenance (avtcore) WG
Date and time 2023-02-23 16:00
Title Minutes interim-2023-avtcore-01: Thu 16:00
State Active
Other versions markdown
Last updated 2023-02-24

Audio/Video Transport Core Maintenance (avtcore) Working Group

CHAIRS: Jonathan Lennox
Bernard Aboba

Virtual Interim Agenda
Date: Thursday, February 23, 2023
Time: 08:00 - 10:00 Pacific Time

Notes:
https://notes.ietf.org/s/notes-ietf-interim-2023-avtcore-01-avtcore

Meeting link:
https://datatracker.ietf.org/meeting/interim-2023-avtcore-01/session/avtcore

Remote instructions:
https://meetings.conf.meetecho.com/interim/?short=4da9b4c1-62cd-458e-8a8f-58fa91cf8ce4

Slides:
https://docs.google.com/presentation/d/1QAo7WiUmIfKWYp5ntV37aYvtnSuyZ0gNqrcrnQS4KkM/


  1. Preliminaries (Chairs, 15 min)
    Note Well, Note Takers, Agenda Bashing, Draft status

  2. RTP Payload Format for SCIP (D. Hanson, 15 min)
    https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-scip

  3. RTP Control Protocol (RTCP) Messages for Green Metadata (Yong He, 15
    min)
    https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtcp-green-metadata

  4. RTP over QUIC (J. Ott, M. Engelbart, S. Dawkins, 25 min)
    https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic

  5. Viewport and Region-of-Interest-Dependent Delivery of Visual
    Volumetric Media (S. Gudumasu, 10 min)
    https://datatracker.ietf.org/doc/html/draft-gudumasu-avtcore-rtp-volumetric-media-roi

  6. Wrapup and Next Steps (Chairs, 15 min)


Draft status updates

  • VP9 payload is in MISSREF, waiting on framemarking draft.

RTP Payload Format for SCIP (Dan Hanson, Mike Faller)

IESG Ballot has completed, 3 DISCUSS comments.

  • In SCIP, packetization and rate control are handled by the underlying
    codec. If the underlying codec (e.g. G.729) does not have the
    ability to control rate, then the ability of SCIP to respond to
    congestion will be limited.
  • SCIP supports both control and data traffic. Control traffic handles
    key management as well as negotiation. Control traffic has structure
    as defined in the SCIP documents and includes a length field, so
    SCIP control messages can be split between RTP packets. Data
    traffic, which consists of audio/video as well as chat and
    whiteboard, is encrypted and therefore appears to RTP as an opaque
    blob that can be split between RTP packets. This means that the SCIP
    stream can appear differently depending on the state of SCIP and
    whether the traffic represents control or data.
  • SCIP authors are looking for WG review of the proposed changes
    before publishing a revision 5.
  • Bernard Aboba: Would be useful to take a look at the ballot
    positions. The IESG is looking for a standard RTP payload type
    document, with sections on the RTP packet format, RTCP feedback,
    etc. but they didn't get that.
  • BA: The SCIP authors appear to have responded to Francesca's review
    comments relating to change control. Roman is asking if the traffic
    is entirely opaque - once you've exchanged the keys, it is entirely
    opaque, correct?
  • Dan Hanson: Yes. But the control traffic has structure, as described
    in SCIP-210.
  • BA: While you don't have to go into detail on the key establishment
    process, I do think you need to describe how control traffic and
    data traffic are packetized. Once keys are negotiated, the data
    traffic is entirely opaque. Packetization is handled by the
    underlying codec, which hands the payload up to SCIP for encryption
    and placement in the payload field. So if the codec is H.264, an
    H.264 packetizer is providing a payload in RFC 6184 format to SCIP
    for encryption.
  • BA: You may need to say a little bit about how the SCIP messages
    are fragmented into RTP packets.
  • DH: The messages have their own header with message length.
  • BA: So you will need to describe how the SCIP control messages are
    split between packets (e.g. how fragmentation works) as well as how
    the fragments are de-packetized.
  • BA: There is another comment from Roman about secure session
    establishment protocol behaviour. I'd suggest talking about the
    basics, but I don't think you need to reproduce the SCIP state
    machine, because you just need enough to describe how packetization
    and de-packetization works.
  • BA: Sarker's comment asks what RTP profile should be used. Also need
    to specify exactly what "highly variable" means.
  • Jonathan Lennox: The RTP profile will depend on the underlying
    codec.
  • BA: However, SDP does not reflect that, because the negotiation is
    handled within SCIP.
  • Mike Faller: We've been throwing the word "codec" around a lot, but
    that underlying codec could be a chat session, so it's better to
    think of it as data being carried by SCIP.
  • BA: Yes, since the payload can represent media or chat or
    whiteboard, data is probably a better term.
  • BA: This seems like a question that a description of the overall
    architecture and layering might address.
  • MF: What do you need to know about SCIP to implement it?
  • BA: Since negotiation of the underlying codec is handled by SCIP,
    the RTP profile does not need to be negotiated in SDP. Also, that is
    why the traffic is "highly variable".
  • MF: One of the fears from networking equipment vendors is wanting to
    know how to filter on SCIP, but you can't as it's so variable and
    changing.
  • Jonathan Lennox: Need a statement in the document to just not try.
  • BA: QUIC deals with this in the "ossification" section - you could
    add a section here telling implementers to not bother with deep
    packet inspection because the traffic can vary depending on whether
    it is data or control traffic. Attempting to parse opaque data is
    pointless.
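
The fragmentation and de-packetization question raised above can be illustrated with a small sketch. This is not the SCIP-210 wire format, which these minutes do not reproduce: the 2-byte big-endian length header, the `frame`/`fragment` helpers and the `Reassembler` class are all hypothetical, and the sketch assumes payloads arrive in order and without loss (a real receiver would first reorder by RTP sequence number).

```python
import struct

# Hypothetical header: a 2-byte big-endian message length.
# The real SCIP control message header is defined in SCIP-210.
HEADER = struct.Struct("!H")


def frame(message: bytes) -> bytes:
    """Prefix a control message with its length."""
    return HEADER.pack(len(message)) + message


def fragment(stream: bytes, max_payload: int) -> list[bytes]:
    """Split a byte stream into payloads of at most max_payload bytes."""
    return [stream[i:i + max_payload] for i in range(0, len(stream), max_payload)]


class Reassembler:
    """Rebuild length-prefixed messages from in-order payloads."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def push(self, payload: bytes) -> list[bytes]:
        """Add one payload and return every message completed so far."""
        self._buf += payload
        complete = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # message continues in a later payload
            complete.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return complete


messages = [b"hello", b"x" * 300, b""]
stream = b"".join(frame(m) for m in messages)

receiver = Reassembler()
received = []
for payload in fragment(stream, 100):
    received.extend(receiver.push(payload))
```

Because the length field alone tells the receiver where each message ends, a message may start in the middle of one payload and finish in another, which is the property the payload format document would need to describe.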

RTP Control Protocol (RTCP) Messages for Green Metadata (Yong He)

  • Jonathan Lennox: Question from Nokia - if there was any negotiation
    of the resolution in SDP, how does that interact with these
    messages? Probably that you should never go above the resolution
    which SDP allows.
  • JL: Question from Magnus Westerlund about the format.
  • JL: Better to call the document "Temporal-Spatial Resolution" as the
    Green Metadata effort within MPEG is more than just TSR.

RTP over QUIC (Mathis Engelbart, Spencer Dawkins)

  • Mathis Engelbart: Had feedback that the congestion control section
    needed to describe what kind of congestion control was required, and
    which layer should perform congestion control.

Spencer's summary of controversial ideas in congestion control:

  1. "Disabling QUIC CC" - obviously, any implementation can do
    anything at its end, but there's not a defined way to tell the OTHER
    end to do that. So, how do we make sure we're talking to an "other
    end" that will Do The Right Thing? Port numbers? ALPN (as is in the
    draft today)? Just assume the RTP sender will never cause QUIC CC to
    kick in, because it's interactive media, and think happy thoughts?
    Other ideas?
  2. Conforming to RFC 8085 (because we're running over UDP with little
    or no congestion control happening there).
  3. Is there a MTI rate adaptation algorithm for RTP-over-QUIC? -
    Spencer doesn't think so, because NADA and SCREAM are IRTF-stream
    algorithms. For some value of "we", "we" could talk to the authors
    about how to bring this through the IETF. Are we going to do that?
    Spencer also notes that Christian Huitema has a proposal for
    media-aware CC in MOQ now, so it may be too soon to pick an
    algorithm now.
  • Spencer Dawkins: Been talking about disabling QUIC CC for years. Any
    implementation can do whatever it wants at its end, but no way for
    the peer to be told to do the right thing. Could be in ALPN, but
    some people don't like that.
  • SD: What would we need to do to conform to RFC 8085, as we're running
    over UDP? Turning off QUIC CC would require understanding of what
    the app layer will do.
  • Bernard Aboba: In QUIC, both sides don't need to use the same
    congestion control algorithm. The CC algorithm isn't negotiated, and
    yet the two sides can interoperate. On disabling QUIC CC, is this a
    serious thing that people want to do? We've talked in various places
    about trying to choose CCs that would work. If you are compiling a
    QUIC library within your application, you could remove the QUIC CC
    algorithm and handle CC in the application instead, but it's highly
    unlikely that a browser would allow an application to disable QUIC
    CC. As an example, in WebTransport, there is no constructor argument
    for "no congestion control".
  • SD: There's two layers to this - using the language of disabling
    QUIC CC, but how do all the entities in an implementation know who
    is doing congestion control, and how to know that's not appropriate
    for H3? I.e. don't use BBR or other CCs that do bandwidth probing.
  • BA: In the case you've just mentioned, it's possible for one peer to
    use BBR and the other to use New Reno. The Probe RTT phase would
    destabilize rate control (the application sees loss or increased
    delay, so it reduces rate, then probe RTT kicks in, delay goes back
    up and application will reduce rate even more though it is not
    necessary). So rather than worrying about CC negotiation, the draft
    should instead describe what algorithms are likely to provide good
    results.
  • SD: Should we have a mandatory to implement rate adaptation
    algorithm? Spencer doesn't think so, but wondering if others agree?
  • BA: Rate adaptation depends on what mechanisms the encoder provides.
    The encoder may support scalable video coding, which enables
    dropping or adding of layers, or it may support per-frame or
    per-macroblock QP control. Or the encoder might just allow changes
    in resolution or framerate. So a given rate control algorithm may
    not be implementable if the encoder doesn't offer the required
    controls.
  • SD: A lot of the work in RMCAT (NADA, SCReAM) was done on how to not
    break other flows of NADA or SCReAM, but not necessarily what would
    break other congestion controllers.
  • Peter Thatcher: I don't think this is a question of whether we're
    turning off QUIC CC - what is the feedback it's going to use to
    implement feedback to RTP - is that going to be feedback from QUIC
    or from RTP? Are we going to extend QUIC to have the timestamps
    necessary, or is it going to be feedback embedded in streams or
    datagrams. How implementations use that feedback should be up to
    that implementation.
  • SD: Agree with Peter. I want to improve the quality of discussions
    by having better descriptions on issues on github. We haven't had a
    lot of conversation that focuses on all three options.
  • Jonathan Lennox: Agree that whether you have congestion control at
    the RTP level - I think the decision of how much is available to
    send can happen at either end, but the decision of what to send
    needs to be decided by the sender only and this needs to be
    clarified. My expectation is that if you run a rate-based CC like
    GCC or NADA on top of a CC like New Reno or CUBIC, they'd work
    together well, but not with BBR. The issue is communication between
    the layers.
  • BA: Key frames cause the most issues because they are so much larger
    than everything else. Average Bitrate Targets are just averages, but
    since keyframes are 10+ times larger than P-frames, keyframes are
    much more susceptible to loss and queuing delays. Also, when sent
    over QUIC reliable transport, the size of the congestion window is
    useful to know, since keyframes may require multiple roundtrips to
    send. For example, at high resolutions, the keyframe might be 150 KB
    and if the cwind is 15 KB (10 packets), that implies 10 roundtrips!
    This can dictate the startup latency, or the length of a video
    freeze that will be experienced in the event that a keyframe is needed
    to switch between streams or to recover from loss. W3C WebTransport
    WG has been working on metrics to provide to the application.
  • SD: Going to have an arms race between people wanting to send more
    and those wanting better compression. The very biggest frames that
    we send are getting bigger over time, until someone comes up with a
    way to amend that.
  • BA: Better compression means you can either keep the quality the
    same and save bits, or use the same number of bits but do more with
    it. In many applications (such as 4K gaming or AR/VR) the
    inclination will be to provide higher quality, so the bitrate won't
    go down.
  • SD: Mathis and Joerg have already done some due diligence with QUIC
    CC, and noted it wasn't kicking in on some representative RTP
    traffic. It didn't, but we need to keep an eye open in case that
    starts happening.
  • BA: The amount of concurrency is an issue. If you're sending the key
    frame at the same time as P-frames, this allows the pipe to stay
    full even if the keyframe is experiencing loss and retransmission.
    While this may drag out the length of time needed to send a key
    frame, it can decrease glass-glass latency because the pipe is kept
    full, and the cwind grows or recovers more quickly (since it is
    being pushed up by the concurrent sending of P-frames).
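
The keyframe arithmetic mentioned in the discussion above (a 150 KB keyframe and a 15 KB congestion window) can be checked with a short sketch. The function names are hypothetical. The first function assumes the window stays fixed, which matches the figure of 10 roundtrips quoted in the minutes; the second assumes an idealized, loss-free doubling of the window every roundtrip, which is an added assumption for comparison and not something stated in the meeting.

```python
def round_trips_fixed_cwnd(frame_bytes: int, cwnd_bytes: int) -> int:
    """Roundtrips needed when the window never grows (ceiling division)."""
    return -(-frame_bytes // cwnd_bytes)


def round_trips_doubling_cwnd(frame_bytes: int, initial_cwnd_bytes: int) -> int:
    """Roundtrips needed when the window doubles each roundtrip, with no loss."""
    sent = 0
    cwnd = initial_cwnd_bytes
    rtts = 0
    while sent < frame_bytes:
        sent += cwnd   # one full window is delivered per roundtrip
        cwnd *= 2      # idealized slow-start growth
        rtts += 1
    return rtts


KEYFRAME_BYTES = 150_000  # 150 KB keyframe, as in the example above
CWND_BYTES = 15_000       # 15 KB window (10 packets of 1500 bytes)

fixed = round_trips_fixed_cwnd(KEYFRAME_BYTES, CWND_BYTES)        # 10
doubling = round_trips_doubling_cwnd(KEYFRAME_BYTES, CWND_BYTES)  # 4
```

Either way, the number of roundtrips multiplied by the path RTT gives a rough lower bound on how long the keyframe takes to deliver, which is why the window size is a useful metric to expose to the application.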

Viewport and Region-of-Interest-Dependent Delivery of Visual Volumetric Media (Srinivas Gudumasu)

  • Jonathan Lennox: Are you seeking adoption?
  • Srinivas Gudumasu: Not at this time, just looking for feedback.
  • Spencer Dawkins: Is this the rtp-v3c-00 draft?
  • SG: No, this is on top of that.
  • JL: Might be better for MPEG to define what a region and the
    viewport is - make sure you have the correct experts looking at it.
  • SG: Already working on this.
  • Mo Zanaty: You were talking about the coordinate system - does V3C
    describe it in terms of meters? Or are the units arbitrary and not
    based on real-world sizes? Might not be interoperable when looking
    at an observer's real-world dimensions.
  • SG: The sender converts the dimensions and translates them into the
    coordinates for the spatial regions.
  • Mo Zanaty: Needs more wordsmithing to make this more obvious that
    it's images and pixels relative to the media itself.

Action items

  • SCIP authors to add material to the introduction providing an
    overview of SCIP, section on RTP payload format (including
    packetization/de-packetization) answering IESG questions.

Log of Zulip Chat

Sam Hurst
I can try taking notes - will be my first time for an IETF meeting but I
can give it a go

11:05:52
Dale Worley
I'm not hearing any audio, though the browser and Meetecho think I've
got sound enabled and Meetecho reports multiple kbps. I heard the
previous speaker.

11:17:11
Jonathan Lennox
Dale, is that still the case?

11:19:14
Dale Worley
Yes, and toggling the immediately obvious controls doesn't fix it. I am
getting the notifications as to who is speaking.

11:20:12
Ugh, my UI problem. The "Mute Audio" icon in the upper left is "mute
audio to me" whereas the same icon in the lower right is "mute audio I
am generating".

11:22:23
Sam Hurst
My audio glitched out for a moment there - what doesn't matter (so I can
add it to the notes)

11:33:03
Rui Paulo
we can also tell the other side to disable CC via QUIC transport
parameters

11:50:17
Joerg Ott
That would mean that a QUIC library without RTP would be able to signal
that, too, which we may not want to suggest. No CC at all would not be a
good idea.

11:51:10
Peter Thatcher
This isn't really about "disabling QUIC CC". This is more about "the
congestion control for the RTP packets is done using feedback other than
QUIC feedback". Whether it's "in QUIC" or "out of QUIC" is an
implementation detail. The real question for a standard is: what
feedback is being used. Is it QUIC feedback or RTP/RTCP feedback?

11:54:30
Joerg Ott
Right, I am saying only that you should not be allowed to negotiate
turning off QUIC CC inside QUIC transport parameter negotiation

11:55:39
Peter Thatcher
For example, in the WebRTC world, we don't standardize googc. We
standardize transport-wide-cc (or whatever was the result of trying to
standardize it :).

11:56:01
I agree it doesn't make sense to negotiate turning off QUIC CC.

It might make sense to negotiate some kind of QUIC version of REMB,
though.

11:56:58
Harald Alvestrand
well, we failed to find the resources to actually propose transport-cc
for standardization, so that kind of died.....

11:57:10
the proper view is probably that we should have a congestion limitation
as part of QUIC, but a decision on what to send next (or decide not to
send) as part of the application.

11:58:21
Peter Thatcher
Options for feedback:
A. REMB style: have the receiver tell the sender a number. Probably need
a timestamp similar to the RTP abs-send-time header extension attached
to each packet for the receiver to make a good calculation.
B. transport-wide-cc style: have the receiver send per-packet receive
timestamps in ACKs. Let the sender do the calculation.

I think in either case, extending QUIC will be much better than doing
this above QUIC.

12:02:25
I think this works for B:
https://www.ietf.org/archive/id/draft-smith-quic-receive-ts-00.txt

12:03:52
Joerg Ott
B. is what we have in the draft

12:05:15
including quite a bit of discussion on what to inherit from the QUIC
layer

12:05:40
Vidhi Goel
How big is the Pframe?

12:06:12
Joerg Ott
Of course, good ol' RTCP is also an option when you need something that
QUIC doesn't give you (yet)

12:06:29
Peter Thatcher
Joerg: That sounds like a good approach. Why don't we just try and
implement that with, say, googcc, and see if it works well? I'm guessing
it will. What more do we need?

12:07:08
Mo Zanaty
MOQ has a similar need / dilemma. That argues for solving this within
the QUIC CC and feedback layer, not above it.

12:07:53
Peter Thatcher
And I'd like googc to work in WebTransport as well.

12:08:14
Mo Zanaty
For receive timestamp extensions to QUIC, there is also Christian's
original proposal:
https://datatracker.ietf.org/doc/draft-huitema-quic-ts/

12:09:10
Peter Thatcher
If anyone is interested in adding support for
https://www.ietf.org/archive/id/draft-smith-quic-receive-ts-00.txt and
googcc to an impl of QUIC in Rust, let me know. I'm interested in making
that work.

12:10:30
Joerg Ott
Peter: we implemented quite a bit of this, and it seemed to work. But
the details of QUIC CC interaction remain.

12:10:43
Let's chat offline about this (I will have to disappear shortly for
boarding my flight)

12:11:22
Vidhi Goel
I can work on congestion control issues

12:13:29
Peter Thatcher
Joerg: I'm interested to see how far you got

12:13:50
If anyone wants to chat QUIC CC offline, send me an email
(pthatcher@microsoft.com)

12:15:25
Joerg Ott
We'll reach out.

12:15:44