Audio/Video Transport Core Maintenance (avtcore) Working Group

CHAIRS: Jonathan Lennox
Bernard Aboba

IETF 115 Agenda
Date: Tuesday, November 8, 2022
Time: 09:30 - 11:30 London Time
Session I, Mezzanine 10-11

Meeting link: https://wws.conf.meetecho.com/conference/?group=avtcore

Notes: https://notes.ietf.org/notes-ietf-115-avtcore

Slides:
https://docs.google.com/presentation/d/1BIcHr7XF03vn81DTRRv4-DQv5-niaVCGCi2ZPNqSwwc/


  1. Preliminaries (Chairs, 20 min)
    Note Well, Note Takers, Agenda Bashing, Draft status, CfAs

    Notetaker: Magnus Westerlund

    CfA on "Game State over RTP"
    (draft-jennings-dispatch-game-state-over-rtp)

    Jonathan Lennox: The "Call for Adoption" of "Game State over RTP"
    completed on May 8, 2022. There were only two responses to the CfA,
    one from Suhas (in favor) and one from Stephan Wenger, who was in
    favor of adopting the RTP payload format part of the spec, but not
    the game states (the majority of the document). At IETF 114, the
    Chairs proposed potentially extending the CfA, if the authors have a
    proposal to develop a community to move the work forward in IETF. Do
    the authors want to continue?

    Cullen Jennings: Defining the RTP format in AVTCORE is possible
    under the Charter, but apparently there is not enough interest. I
    agree that only two respondents is not enough.
    David Schinazi: Why is an IETF RFC needed?
    Cullen: We need an IANA registry to enable registering future
    additional game states.
    Stephan Wenger: I don't think that defining game state formats is
    the IETF's business. This WG doesn't have the expertise in games to
    review those formats. This is more like a codec format and that
    should be done in a body that has interest in this type of work,
    like AoMedia or MPEG.
    Cullen: The IETF has done Opus, VP8 and VP9, which bodies such as
    MPEG did not like.
    Stephan: I haven't seen any effort in MPEG or AoMedia to get them to
    accept the game state format.
    Bernard Aboba: The W3C has defined a game pad API and has had
    game-related workshops. So perhaps there is expertise there?
    Harald Alvestrand: AVTCORE is the gatekeeper for registering the RTP
    payload format. So it should lead, follow or get out of the way.
    Stephan Wenger: There is a difference between the RTP payload format
    and the game state format. The AVTCORE WG does not define codecs,
    this is done by others. I did not object to defining the RTP payload
    format in AVTCORE; the WG has the competence to review that.
    Magnus: The RTP payload format is AVTCORE's chartered business, but
    going through a WG is not the only way of publishing and registering
    an RTP payload format. That could be done via an Independent
    Submission.
    David Schinazi: The IANA registries are open. An alternative to
    handling the work in the WG would be to publish the document in the
    independent stream.
    Bernard Aboba: What would the registry policy be?
    Cullen: A small number of code points would be Expert Review (with
    specification required) and larger spaces would be first come, first
    served.
    Jonathan: Does anyone object to taking this document to the
    Independent Stream? No objection.

    Action: The AVTCORE WG recommends that the document be taken to the
    ISE.

    CfA of RTP Payload format for V3C.

    Jonathan: The CfA completed on October 31, 2022. There were five
    responses, all positive.
    There appears to be WG consensus for adopting the document. Any
    objections?
    No objections.
    Action: The AVTCORE WG adopts "RTP Payload Format for V3C". The
    authors should re-submit the document as draft-ietf-avtcore-rtp-v3c.

    CfA on “RTP Control Protocol (RTCP) Messages for Green Metadata”
    Jonathan: This CfA is ongoing, due to complete on November 30,
    2022. Please respond to the CfA on the list.

  2. RTP Payload Format for the SCIP Codec (D. Hanson, 10 min)
    https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-scip

    Dan Hanson: We submitted -03 in October 2022, and have contacted the
    reviewers in SECDIR, GENART and ARTART to verify that their review
    comments have been satisfactorily addressed. One reviewer asked
    whether the reference to SCIP210 should be normative.

    Cullen: The reference should remain informative, and that should be
    the general guidance for RTP payload format authors. That is
    important. We need to distinguish IPR declarations relating to the
    RTP Payload specification from declarations relating to the codec.

    Stephan Wenger: The IETF seems to be getting more and more
    bureaucratic.
    Bernard: Yes, that is a concern. The authors have messaged the
    reviewers, informing them of the draft updates. Did they respond?
    Dan Hanson: No.
    Bernard: You have done your part. We should have enough information
    to move forward on the Publication Request.

  3. RTP over QUIC Sandbox (B. Aboba, 10 min)
    https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic

    At the virtual interim, we presented an experimental implementation
    of an "RTP-ish" packet format using the WebCodecs and WebTransport
    APIs, using only JavaScript, no WASM. Latency was good at small
    resolutions (QVGA/VGA), but at high resolutions (HD, full-HD) there
    was a visible lag. For example, for AV1 encoding of full-HD at 30
    fps, with a 1 Mbps average target bitrate, glass-glass latency was
    measured at 630 ms when the frame RTT was 100 ms. Other curious
    observations: re-ordering was not observed on the receiver, and the
    observed bandwidth consumption was considerably lower than the
    average target bitrate.

    We investigated a number of potential causes and believe we have
    found the culprit. In the WebTransport API, the writer.write(chunk)
    promise completes when the chunk is handed off to the QUIC send
    queue. As a result, await'ing the write promise causes only one
    frame to be sent at a time, so that I-frames and P-frames cannot be
    sent concurrently. In effect, this re-introduces head-of-line
    blocking, even when each frame is sent on its own QUIC stream.

    If, instead of await writer.write(), we use await writer.ready,
    then writer.write(chunk) and writer.close(), the WHATWG streams
    buffers are kept full and concurrent sending is restored. The
    glass-glass latency decreases markedly, and re-ordering is now
    observed on the receiver (e.g. P-frame(s) are received before the
    first I-frame). Also, bandwidth consumption is much closer to (and
    slightly larger than) the average target bitrate.
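
    The difference between the two sending patterns can be sketched in
    TypeScript as follows. This is a minimal illustration, not the
    sandbox code: the ChunkWriter interface, both function names and the
    onError callback are hypothetical, and the interface only mirrors
    the ready, write() and close() members of a WHATWG
    WritableStreamDefaultWriter so that the sketch runs without a
    WebTransport session.

```typescript
// Hypothetical structural type: the subset of WritableStreamDefaultWriter used below.
interface ChunkWriter {
  readonly ready: Promise<void>;
  write(chunk: Uint8Array): Promise<void>;
  close(): Promise<void>;
}

// "Before" pattern: the caller is suspended until the write promise settles,
// so a loop that sends frames this way handles one frame at a time.
async function sendFrameSerialized(writer: ChunkWriter, frame: Uint8Array): Promise<void> {
  await writer.write(frame);
  await writer.close();
}

// "After" pattern: wait only until the writer signals it can accept data,
// then queue the write and the close without awaiting either promise.
async function sendFrameConcurrent(
  writer: ChunkWriter,
  frame: Uint8Array,
  onError: (e: unknown) => void = () => {}
): Promise<void> {
  await writer.ready;
  // Rejections are routed to the callback so they do not go unhandled.
  writer.write(frame).catch(onError);
  writer.close().catch(onError);
}
```

    In a browser, the writer for each frame would typically be obtained
    with (await transport.createUnidirectionalStream()).getWriter().
    Because sendFrameConcurrent returns as soon as the write and close
    are queued, the next frame can be opened on its own stream while
    earlier frames are still in flight.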

    Another interesting effect is that the excess I-frame RTT observed
    previously seems to have decreased. Instead of the I-frame RTT being
    2-3 times the P-frame RTTmin, it is now closer to the transmission
    line. One potential explanation might be that the congestion window
    in the "after" plot is large enough to allow an I-frame to be sent
    in a single RTT. In these plots (AV1 at full-HD, 30 fps, 300 Kbps
    average target bitrate), the I-frame is of modest size (< 12 KB).

    A good article on the performance pitfalls of JavaScript async/await
    is here:
    https://www.learnwithjason.dev/blog/keep-async-await-from-blocking-execution

    A version incorporating concurrent sending and "bring your own
    buffer reads" can be found here:
    https://webrtc.internaut.com/wc/wtSender7/

  4. SFrame and RTP over QUIC (P. Thatcher, 15 min)
    https://github.com/mengelbart/rtp-over-quic-draft/issues/29
    https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic

    Peter Thatcher: Issue 29 in the RTP over QUIC Github repo relates to
    SFrame. With RTP over QUIC, it is possible to encapsulate an entire
    SFrame and send this over a QUIC reliable stream. But what happens
    if you have some participants in a conference that support RTP over
    QUIC, and others that support only RTP over UDP? The conference
    server may then need to convert large RTP packets into smaller ones.
    I-D.draft-ietf-avtcore-rtp-over-quic-01 Section 4.1 says that it
    “may need codec-specific knowledge to packetize the payload of the
    incoming RTP packets in smaller RTP packets.”

    What does "codec specific knowledge" mean in the context of SFrame
    where only the endpoints have the key? With respect to a middle box
    (to which SFrames are opaque), "may need codec specific knowledge"
    might really mean “may need SFrame-specific knowledge”. SPacket has
    the same problem, because there can still be a need to re-packetize.
    The problem can also occur even if RTP is transported in QUIC
    datagrams because of differences in overhead compared with RTP over
    UDP.

    Summary of the problem:

    1. RTP over QUIC allows for big RTP packets
    2. MTU differences require re-packetizing
    3. Re-packetizing is “codec-specific” (sframe-specific)
    4. Problem cannot be solved purely on the endpoints

To solve the problem, there needs to be a way to indicate how an SFrame
is split between packets, to allow the SFrame to be re-assembled when
the packets arrive at the endpoint. One potential solution to the
problem is to prepend an SFrame sequence number and SFrame chunk index
at the beginning of the RTP payload, after which the SFrame follows.

Peter shows an example.
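
The example from the slides is not reproduced in these notes. As a rough
illustration of the idea only, the sketch below splits an SFrame into
chunks that each carry a prefix with an SFrame sequence number and a
chunk index, and re-assembles them at the receiver. The 3-byte prefix
layout (16-bit sequence number, 8-bit chunk index), the Chunk type and
the function names are all assumptions made for this sketch; the marker
field stands in for a last-fragment indicator such as the RTP marker
bit.

```typescript
// Hypothetical prefix: 16-bit SFrame sequence number + 8-bit chunk index.
const PREFIX_LEN = 3;

interface Chunk {
  marker: boolean;     // true on the last chunk of an SFrame
  payload: Uint8Array; // prefix followed by a slice of the SFrame
}

function readPrefix(payload: Uint8Array): { seq: number; index: number } {
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  return { seq: view.getUint16(0), index: view.getUint8(2) };
}

// Split one SFrame into chunks whose payloads are at most maxPayload bytes.
function splitSFrame(sframe: Uint8Array, seq: number, maxPayload: number): Chunk[] {
  const room = maxPayload - PREFIX_LEN;
  if (room <= 0) throw new RangeError("maxPayload too small for the prefix");
  const count = Math.max(1, Math.ceil(sframe.length / room));
  if (count > 256) throw new RangeError("SFrame needs more than 256 chunks");
  const chunks: Chunk[] = [];
  for (let i = 0; i < count; i++) {
    const slice = sframe.subarray(i * room, (i + 1) * room);
    const payload = new Uint8Array(PREFIX_LEN + slice.length);
    const view = new DataView(payload.buffer);
    view.setUint16(0, seq & 0xffff);
    view.setUint8(2, i);
    payload.set(slice, PREFIX_LEN);
    chunks.push({ marker: i === count - 1, payload });
  }
  return chunks;
}

// Re-assemble the SFrame with the given sequence number from chunks that may
// arrive in any order. Returns null while any chunk is still missing.
function reassemble(chunks: Chunk[], seq: number): Uint8Array | null {
  const parts: (Uint8Array | undefined)[] = [];
  let count = -1;
  for (const c of chunks) {
    const prefix = readPrefix(c.payload);
    if (prefix.seq !== seq) continue; // chunk of a different SFrame
    parts[prefix.index] = c.payload.subarray(PREFIX_LEN);
    if (c.marker) count = prefix.index + 1;
  }
  if (count < 0) return null; // last chunk not seen yet
  let total = 0;
  for (let i = 0; i < count; i++) {
    const part = parts[i];
    if (part === undefined) return null;
    total += part.length;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (let i = 0; i < count; i++) {
    const part = parts[i] as Uint8Array;
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

Neither function touches the SFrame contents, so in this sketch a middle
box could re-chunk for a smaller MTU without holding the key.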

Magnus Westerlund: It appears to be a reasonable scheme, but it needs a
last fragment indicator.
Peter: Yes. One approach I looked at was to utilize the marker bit.
Jörg Ott: How is the generic problem solved for RTP over UDP? The same
problem can occur in a translator, correct? It isn't discussed in the
topologies draft. [It is mentioned in RFC 7667 Section 3.2.1.2,
Topo-Trn-Translator]
Bernard Aboba: Conference servers (which often have codec-specific
knowledge) can re-packetize.
Mo Zanaty: How does this deal with authentication and the goal of having
only one authentication tag for the whole?
Peter: SFrame is end-to-end. A middle box can re-assemble the whole
SFrame if it needs to.
Harald Alvestrand: SFrame is end-to-end. SRTP handles hop-by-hop
encryption and authentication. You should not require SFrames to be
verified by middle boxes. That is over-constraining.
Sergio Garcia Murillo: You are trying to solve the middlebox issue
without having resolved how SPacket would work in general. We should
solve how SFrame over RTP works first.
Peter: This proposal can also be the basis for the SFrame over RTP
packet format. Today there is no way for an SFrame to be split up into
chunks and be re-assembled. That is a problem whether the endpoint is
packetizing (SPacket) or a middle box is re-packetizing.
David Schinazi: QUIC datagram enthusiast. The fundamental problem is
that you have MTU issues. The solution is a clone of IPv4 fragmentation
and we know that is inefficient.
Bernard Aboba: The goal of RTP re-packetization is to avoid IP
fragmentation.
David Schinazi: Why can't you do MTU discovery?
Cullen Jennings: In a conference, participants can come and go. An
endpoint sending RTP does not track the MTU of each participant and it
cannot do MTU discovery to each of the (many) participants. However, one
solution might be for RTP senders to use the minimum MTU (e.g. 1200
bytes).
Peter Thatcher: The use case is a mixed conference where some (most?)
participants are using RTP over QUIC, but there also might be some RTP
over UDP participants.
Jonathan Lennox: This is an issue for SFrame over RTP to solve. It is a
general issue that applies not only to RTP over QUIC.
Richard Barnes: The direction that the SFrame WG is going is to only
specify the SFrame. There is a lot of ambiguity about the relation
between RTP and SFrame. There should be some possibility for datagram
packetization layer path MTU discovery, similar to what is done in
transport protocols.
Cullen Jennings: MTU discovery doesn't work for RTP conferences.
Jonathan Lennox: We need an SFrame over RTP draft. I don't care if this
is done in AVTCORE or SFRAME WG. Richard noted that SFrame WG is focused
on getting the encryption done.
Sergio: SFrame over RTP was rejected by the WG.
Bernard Aboba: There were suggestions that endpoints do packetization
first, then encryption. You had proposed that the endpoints encrypt
first, then packetize.
Jonathan Lennox: There are RTP payload specifications that allow for
delivery of slices. Packetizing first, then encrypting allows those
codecs to packetize as designed. Encrypting and then packetizing does
not.
Mo Zanaty: SFrame needs to support being fragmented to what the lower
layer can handle. This is basic to any transport. I'd argue that
SFrame needs to support fragmentation natively.
Spencer Dawkins: We should be looking at RTP topologies and what
features we need to support that, in order for SFrame to function.
Magnus: RTP over QUIC should not specify how to fragment SFrame. SFrame
fragmentation is needed either natively in SFrame or in the SFrame RTP
payload format. I think that the SFrame RTP Payload Format should be
done in the AVTCORE WG.
Sergio: There has been little progress on the SFrame RTP Payload format.

Peter: Sounds like people do not believe that the proposal should be
submitted as a PR to RTP over QUIC. Should I submit a draft for the
"SFrame RTP Payload Format"?
Jonathan Lennox: There have been some RTP issues that have not been
resolved because of the lack of dedicated side meetings with the right
people. Also, with RTP over QUIC streams allowing for an unlimited MTU,
it forces RTP middle boxes to re-packetize to a greater extent than what
has been done before. Therefore this issue does appear to be something
that RTP over QUIC needs to discuss.

Action: A side meeting will be scheduled to discuss these issues, as
well as a virtual interim.

  5. RTP over QUIC (J. Ott, M. Engelbart, 20 min)
    https://datatracker.ietf.org/doc/html/draft-ietf-avtcore-rtp-over-quic

    Issue 45: The addition of a length field allows multiple RTP packets
    to be sent on the same QUIC stream. However, this makes it difficult
    for the receiver to cancel an incoming stream.
    Bernard: Not sure that preventing a receiver from cancelling
    streams is a problem. The sender has more information on what it is
    sending and the appropriate timeout to set. For example, the sender
    can set a higher timeout for a key frame or base layer frame than an
    SVC extension layer frame that is discardable.
    David Schinazi: When the receiver sends a STOP_SENDING frame, this
    is to ask the sender to stop and reset the stream. The sender can
    ensure that it stops on a frame boundary. There might be
    implementation issues.
    Lucas: There may be implementation bugs. What is safe here?
    Jörg: STOP_SENDING unaligned will only result in loss of the RTP
    packet that is being received.
    Cullen Jennings: (after discussion) I prefer option A (accept that a
    receiver can't cancel streams). The receiver doesn't know how much
    data is in flight or will be lost. Sending a STOP_SENDING frame
    will save some capacity.
    David Schinazi: I apologize for being too clever. We should have
    called QUIC Streams "Messages" instead. Doing B (Require only one
    frame per stream) will have some advantages.
    Piers O'Hanlon: Sending over datagrams will not have this issue.
    Jörg: Support for reliable stream transport has been expressed. It
    is considered desirable.
    Magnus: Proposal B will limit the session lifetime to a long but
    finite session length. This should be noted in the document if it is
    used. I support option B.
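
    The length-field framing behind Issue 45 can be pictured with the
    following sketch. It assumes a 16-bit big-endian length prefix in
    the style of RFC 4571; the encoding actually used by the draft may
    differ, and the function and class names are hypothetical. The
    deframer shows that a receiver only learns where one RTP packet ends
    and the next begins by parsing the byte stream.

```typescript
// Assumed framing: 2-byte big-endian length, then the RTP packet bytes.
const LENGTH_FIELD = 2;

function frameRtpPacket(packet: Uint8Array): Uint8Array {
  if (packet.length > 0xffff) throw new RangeError("packet too large for a 16-bit length");
  const out = new Uint8Array(LENGTH_FIELD + packet.length);
  new DataView(out.buffer).setUint16(0, packet.length);
  out.set(packet, LENGTH_FIELD);
  return out;
}

// Incremental parser for the bytes read from one stream.
class RtpStreamDeframer {
  private buffered = new Uint8Array(0);

  // Feed newly received bytes; returns every RTP packet completed so far.
  push(data: Uint8Array): Uint8Array[] {
    const joined = new Uint8Array(this.buffered.length + data.length);
    joined.set(this.buffered, 0);
    joined.set(data, this.buffered.length);
    const view = new DataView(joined.buffer);
    const packets: Uint8Array[] = [];
    let offset = 0;
    while (joined.length - offset >= LENGTH_FIELD) {
      const len = view.getUint16(offset);
      if (joined.length - offset - LENGTH_FIELD < len) break; // packet incomplete
      const start = offset + LENGTH_FIELD;
      packets.push(joined.slice(start, start + len));
      offset = start + len;
    }
    this.buffered = joined.slice(offset);
    return packets;
  }

  // Bytes of a partially received packet; these would be discarded if the
  // stream ended at this point.
  pendingBytes(): number {
    return this.buffered.length;
  }
}
```

    With one frame per stream (option B), the end of the stream marks
    the boundary and no such parser state is needed.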

  6. SDP for RTP over QUIC (S. Dawkins, 15 min)
    https://datatracker.ietf.org/doc/html/draft-dawkins-sdp-rtp-quic

    What AVP profiles to register?
    Would like to conclude on what direction to use here.
    Asking for Call for adoption.
    Cullen Jennings: I don't think we are ready for a Call for Adoption
    yet.

  7. Wrapup and Next Steps (Chairs, 10 min)
    Jonathan: Apologies for running out of time.
    Chairs to set up a side meeting on SFrame over RTP as well as
    scheduling a Virtual Interim.