Minutes of the AVTCORE WG Virtual Interim Meeting
January 28, 2021
------------------------------------

### Draft Status

Bernard to do IESG writeup of RTT multiplexing draft
Jonathan to follow up on status of Tetra codec payload draft
Cryptex needs to be submitted as WG draft

### JPEG XS Payload Format
Next steps:
1) new WGLC
2) Reviews by Jonathan and Bernard (perhaps others?)

### Framemarking Last Call
(from slide) Stephan Wenger: Do we still think this is a good idea, i.e., will
it work in practice? Do we need to require future codec specifications to
include support for it?
(from slide) Sergio Garcia: We have identified some issues with VP8/9.
Mo Zanaty: The P/U bit issue triggered a recent update to the draft (re:
temporal nesting).
Bernard: Aren't all the temporal scalability modes nested (e.g. L1T2, L1T3,
etc.)?
Jonathan: All the useful ones.
Bernard: So the P/U bit issue is only relevant to non-nested spatial
scalability modes (e.g. VP9)?
Jonathan: Vidyo used framemarking to introduce the Temporal ID field that was
not present in H.264/AVC, so that temporal scalability could be supported in
WebRTC.org (to which a patch was contributed).
Bernard: This allowed H.264/AVC implementations in WebRTC to support temporal
scalability without having to implement H.264/SVC (a separate codec with
additional NAL units).
Sergio: Had to do some remapping for VP9, and K-SVC was not supported.
Bernard: VP8 implementations require the PictureIDs to be consecutive (this is
not the case for VP9). So if an SFU drops frames, it has to rewrite the
PictureID and TL0PICIDX fields, or else the decoder might not be able to
handle it. This means that an SFU supporting VP8 MUST be able to parse (and
modify) at least portions of the VP8 payload.
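A minimal sketch of the rewrite Bernard describes, assuming the VP8 payload
descriptor layout of RFC 7741; the helper and its name are hypothetical, not
from the discussion:

```python
def rewrite_vp8_descriptor(payload: bytes, picture_id: int, tl0picidx: int) -> bytes:
    """Rewrite PictureID/TL0PICIDX in a VP8 payload descriptor (RFC 7741).

    Hypothetical SFU helper: after dropping frames, forwarded packets must
    still carry consecutive PictureIDs, so each one is patched in place.
    """
    buf = bytearray(payload)
    if not buf[0] & 0x80:          # X bit clear: no extension octet present
        return bytes(buf)          # nothing to rewrite
    ext, pos = buf[1], 2
    if ext & 0x80:                 # I bit: PictureID present
        if buf[pos] & 0x80:        # M bit: 15-bit PictureID over two octets
            buf[pos] = 0x80 | ((picture_id >> 8) & 0x7F)
            buf[pos + 1] = picture_id & 0xFF
            pos += 2
        else:                      # 7-bit PictureID in one octet
            buf[pos] = picture_id & 0x7F
            pos += 1
    if ext & 0x40:                 # L bit: TL0PICIDX present
        buf[pos] = tl0picidx & 0xFF
    return bytes(buf)
```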
Jonathan: This is an issue with VP8 decoders that could in principle be fixed.
Justin: From a Chrome perspective, although I forget the details, I think
SFRAME's marking (the Generic Frame Descriptor) was intended to replace
framemarking.
Bernard: To be clear, the VP8 PictureID/TL0PICIDX issue will afflict *any* RTP
header extension used for forwarding. So a GFD or Dependency Descriptor
implementation will still need to parse and modify the VP8 payload. So if the
goal is to enable "codec agnostic forwarding without parsing the codec
payload", it would seem unachievable (at least without patching VP8
implementations).
Mo: I don't think there's anything codec-specific in framemarking that causes
a problem. Really a question of whether you have a complex/dynamic scalability
structure or not. Framemarking was only designed for simple/static ones, and
it can do that for any codec. To Stephan's point, if you are doing something
tricky, you really need to expose the codec bytes directly rather than use
this sort of generic descriptor. But it would still be nice to have something
rather than nothing.
Sergio: There's still a lot of codec-specific work to be done with
framemarking, so that's a problem for implementors (referring to earlier
comments about VP8).
Bernard: Jonathan has had some ideas about how to improve VP8 to deal with
this issue, but ultimately there are problems when you want to do e2e
encryption; because of the VP8 issue you cannot encrypt and integrity protect
the entire payload.
Mo: The goal here was to allow receivers to get metadata without having to dig
into the payload. But the sender always needs to understand the metadata so
that it can construct the framemarking header.
Jonathan: Do we want to abandon framemarking, or just the part that says all
payloads need to have a section indicating how they use this mapping?
Bernard: Maybe we can get people to weigh in on this. The document is still
useful and valuable, if only to document what was done and what we learned
from it. Opinions on what to do next?
Mo: I would tend to side with use rather than anything else. If people want to
use this, or something like this, let's proceed. If nobody wants this, then
no. Still, there's a lot of activity in AV1 DD and SFRAME indicating there is
interest here.
Bernard: These follow-on efforts do seem to be encountering some of the same
issues identified here. Not sure that those new efforts can solve what we
couldn't solve here.
Mo: New things may still come up in VVC or elsewhere.
Bernard: As Stephan noted, SFU devs don't want to have to parse every codec.
Being able to easily support a new codec has agility value, even if every
feature of the new codec may not be supportable (or there is some loss of
optimality). Is there value in getting a simple solution that covers most
cases?
Sergio: Use cases drive implementation. VP9 K-SVC is used in webrtc.org screen
sharing. If framemarking cannot support VP9 or AV1 K-SVC or VP8, support for
temporal scalability in H.264/AVC is not enough to justify keeping the code
around, because you will need an alternative that can achieve wider coverage.
Magnus Westerlund: Each bit of information needs to be evaluated for its
privacy impact. We really need SFRAME users to think about how they package
their metadata. Generalizing as much as possible still seems useful. Something
to think about here.
Bernard: Do you have a specific opinion here?
Magnus: If nobody is implementing this, maybe we need to skip ahead to v2. We
could publish this as informational.
Justin: Definitely value in a generic mechanism if people feel this would
cover 90% of cases. There will always be custom cases where you need the SFU
to parse raw metadata; we need to figure this out in the SFRAME context. The
real question is whether this mechanism covers 90% or 10%. Would be useful to
get feedback from Google implementors.
Sergio: Don't think framemarking is useful at this stage. Not sure exactly
whether releasing as informational or something else is best.
Tim Panton: Defining the bits can be more difficult than actually generating
them. Doesn't answer the question of what to do with the document.
Mo: To the 10% or 90% question, the authors naturally feel that this covers
90%+ of "common meeting services". Shouldn't restrict implementations from
going beyond. We could say how one could go beyond if desired. Any proposal
will have to have many of the same elements - start of frame, end of frame,
etc.
Sergio: I disagree with the authors on the coverage here. For example, K-SVC
not being supported means it isn't really covering 80% of cases, because you
can't do screen sharing, which is a common use case.
Mo: WG didn't consider VP9 because WG didn't want to try to cover draft
payload specs.
Jonathan: The issue is that framemarking doesn't give the SFU enough
information to know what to forward for spatial or K-SVC modes. If you need to
have framemarking and something else, then framemarking isn't terribly useful.
Bernard: Next steps? Hums?
Roni Even: We should have a hum about whether future codecs will need to
support this.
Jonathan: Let's do the first hum first. 6 in favor of publishing, 2 in favor
of not publishing.
Jonathan: Weak consensus to publish. 2nd hum: should we publish as proposed
standard?
Magnus: Hard to answer this question right now. Do we have something better?
Jonathan: Hard to know right now, i.e., AV1 DD is still being implemented.
Bernard: So we don't know yet whether it can achieve wider codec coverage, or
will face similar limitations to framemarking (likely with respect to VP8).
Magnus: Perhaps publish as experimental?
Jonathan: I had been thinking informational, but experimental could also work.
Bernard: Let's hum for PS. 1 in favor of PS, 7 for something other than PS.
Jonathan: Should we get into the details on Exp vs Info?
Mo: Are we discussing the current doc, or an updated version of the doc, or?
Jonathan: Does anyone want to speak to this?
Jonathan: We could rewrite this to be the AV1 DD, in an extreme case.
Bernard: There's value in documenting what we have. Doesn't make sense to try
to fix every issue; that would turn this into something else. Document what it
does and doesn't do and leave it at that.
Magnus: I think that's the right answer. If there are easy fixes, do that.
Otherwise send it out as experimental. Don't hold it up too long. We can
replace it later if we come up with something.
Bernard: We'll bring this to the list, but it seems like we have good
guidance.
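For reference, a minimal sketch of the per-packet information discussed above
(Mo's "start of frame, end of frame, etc."), assuming the long-form header
extension layout of draft-ietf-avtext-framemarking; the parser itself is
hypothetical:

```python
from dataclasses import dataclass

@dataclass
class FrameMarking:
    start: bool            # S: first packet of the frame
    end: bool              # E: last packet of the frame
    independent: bool      # I: frame decodable without references
    discardable: bool      # D: frame not referenced by other frames
    base_layer_sync: bool  # B: can switch down to the base layer here
    tid: int               # temporal layer ID
    lid: int               # spatial/dependency layer ID
    tl0picidx: int         # counter of base temporal layer frames

def parse_frame_marking(ext: bytes) -> FrameMarking:
    """Parse the 3-byte long form of the frame-marking extension payload.

    This is what lets an SFU drop, say, all TID > 0 packets of an L1T2
    stream without touching the codec payload (VP8 issues above aside).
    """
    b0 = ext[0]
    return FrameMarking(
        start=bool(b0 & 0x80),
        end=bool(b0 & 0x40),
        independent=bool(b0 & 0x20),
        discardable=bool(b0 & 0x10),
        base_layer_sync=bool(b0 & 0x08),
        tid=b0 & 0x07,
        lid=ext[1],
        tl0picidx=ext[2],
    )
```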

### VP9 Payload
Jonathan: Only open issue here is regarding framemarking. Shall we just move
this section to the framemarking draft and publish?
(4 +1s in the chat for this.)
Jonathan: That's what we'll do.

### SFrame
Youenn Fablet: Sergio and I are looking into how to support SFrame. It breaks
assumptions for the packetizer. Not always tractable to handle this. Is there
another approach we can use here? May have some other benefits. 3 main parts:
processing model, generic packetization, negotiation.
Youenn: The processing layer can split a logical frame into multiple
sub-frames with metadata, each of which will be packetized independently.
Youenn: Once we have subframes, we need to send them. The data is opaque; the
packetizer just splits it if needed to fit in the MTU. Metadata is sent in an
RTP header extension. Very similar to framemarking.
Jonathan: This makes life easier for the packetizer, but does it make life
hard for the depacketizer?
Youenn: Depacketizer processing is very simple. Aggregate split frames and
pass to the decoder.
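A minimal sketch of the split/aggregate symmetry Youenn describes, with
hypothetical names; since the subframe is opaque (e.g. SFrame ciphertext),
the packetizer needs no codec knowledge, and frame boundaries travel in the
header-extension metadata:

```python
MTU_PAYLOAD = 1200  # assumed budget for RTP payload bytes per packet

def packetize(subframe: bytes, mtu: int = MTU_PAYLOAD) -> list[bytes]:
    """Generic packetizer: split an opaque subframe into MTU-sized chunks."""
    return [subframe[i:i + mtu] for i in range(0, len(subframe), mtu)]

def depacketize(chunks: list[bytes]) -> bytes:
    """Generic depacketizer: aggregate the chunks of one subframe (as
    delimited by the metadata) and hand the result to the SFrame layer."""
    return b"".join(chunks)

# Round-tripping holds by construction:
# depacketize(packetize(f)) == f for any subframe f.
```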
Sergio: Some complexity on what a frame is. Let's not discuss what comes into
the depacketizer but rather how it will work.
Magnus: Quite reasonable to discuss this. This payload format could be
misused. This goes from a basic payload format to something more complicated.
Colin Perkins: Not clear how much of RTP is left once you do this. All the
effort in payload formats, does that go away? Seems like a startlingly large
change.
Sergio: There was agreement we were going to work on this. We're only looking
at codec-specific packetization here. NACK and FEC are still the same.
Colin: This is a fundamental feature of RTP.
Sergio: This is what SFrame is trying to do.
Colin: This isn't using 90% of RTP features. Why not make a new thing?
Youenn: This reuses lots of RTP - headers, etc. We're giving up some of the
benefits of codec-specific packetization, but we think that's not that much.
Would be good to hear the specific issues. Filing them in Github would be
nice.
Colin: We don't need to go to Github to do this.
Jonathan: Just get them in writing.
Youenn: Let's get precise issues.
Colin: You're not using RTP features here.
Youenn: I don't understand that.
Justin: I think we are using a lot of RTP functionality - headers, encryption,
recovery, SSRCs, etc. So let's get the issues identified, since making a new
thing would have way more unknowns.
Magnus: I think you're talking past each other somewhat. It will become
clearer how you're reimplementing existing mechanisms.
Sergio: Will be doing an update of the document that will make this all clear
(prior to 110).
Mo: This is similar to RED or FEC - those are wrappers, and this is a privacy
wrapper. If we can define how the RTP bits work with this transform, this may
be what we need.
Youenn: We have some more slides that speak to this.
Colin: What Mo is suggesting may match what is proposed, but I'm not getting
that. Need a clearer architectural description; we're talking past each other
right now.
Bernard: Time check.
Jonathan: We need more overview; let's ask the authors to provide more of
this.
Sergio: But we did this in the last IETF meeting.
Youenn: Not sure exactly what is needed and what the next steps are.
Bernard: Next step is to have a draft.
Colin: And it needs to describe how it fits with the rest of RTP. The authors
think they have explained it, but there seem to be a lot of assumptions that
are not written down anywhere.
Jonathan: Let's get an email going with Colin and Magnus. We need to clarify
the clarifications.
Sergio: We need to understand the doubts.
Colin: How does this fit into the RTP architecture?
DrAlex: I have an idea, let's take this offline.

### QUIC RTP Tunneling
Spencer: The filename indicates this draft is targeted at the QUIC WG, but if
this work is in scope for AVTCORE, wouldn't it get more attention here if it
was -avtcore-?
Jonathan: Probably, but it may need some charter tweaking.
Spencer: Seems like an excellent time to have that conversation.
Magnus: QUIC WG is mostly for extensions. Protocol mappings should happen
elsewhere, e.g., here.
Sam: No QUIC extensions needed here. Will repost.
Justin: We've looked at this at length over the past 5 years, and one of the
fundamental problems is the mismatch between QUIC congestion control and RTP
congestion control (which are tuned differently for different purposes).
Sam: Want to experiment with this.
Sam: Interested in any feedback on this document.
Jana (in chat): Justin's point is important. Please bring this to the QUIC WG,
or at least bring it to the attention of the QUIC WG, and engage there as
well. There are many interactions here.
Bernard: Please also note that we have both QUIC datagrams and HTTP/3
datagrams. The WebTransport WG is going to be focusing on Http3Transport, so
there will probably be an additional multiplexing field in the datagram.
WebTransport has also discussed the prioritization issue.
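A minimal sketch of the kind of datagram multiplexing Bernard mentions,
assuming a hypothetical flow-identifier prefix (encoded as a QUIC
variable-length integer per RFC 9000) in front of each RTP packet; this
framing is illustrative, not taken from any of the drafts discussed:

```python
def encode_quic_varint(value: int) -> bytes:
    """Encode a QUIC variable-length integer (RFC 9000, section 16)."""
    if value < 0x40:
        return value.to_bytes(1, "big")
    if value < 0x4000:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 0x4000_0000:
        return (value | 0x8000_0000).to_bytes(4, "big")
    return (value | 0xC000_0000_0000_0000).to_bytes(8, "big")

def wrap_rtp_for_datagram(flow_id: int, rtp_packet: bytes) -> bytes:
    """Prefix an RTP packet with a flow ID so several RTP flows can share
    one QUIC/HTTP3 datagram space. Hypothetical framing for illustration."""
    return encode_quic_varint(flow_id) + rtp_packet
```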