Minutes for AVTEXT at IETF-93
minutes-93-avtext-1
| Meeting Minutes | Audio/Video Transport Extensions (avtext) WG |
|---|---|
| Date and time | 2015-07-23 15:40 |
| Title | Minutes for AVTEXT at IETF-93 |
| State | Active |
| Last updated | 2015-08-13 |
AVTEXT Audio/Video Transport Extensions
Thursday, 23 July 2015, 17:40 - 19:10 CEST (Room: Berlin/Brussels)
Chairs: Jonathan Lennox and Magnus Westerlund (filling in for Keith Drage)
Responsible Area Director: Ben Campbell
Notetakers: Emil Ivov, Varun Singh

Agenda bash and status update
=====
WG status:
- grouping taxonomy approved
- stream pause requires a few changes after AD review
- splicing notification: ready for pub req, will proceed once pause is advanced

Action Item: Ben to confirm that Bo responded to all his AD review items for stream pause, then send to IETF last call

==========================================
RTP Header Extension for source description
Magnus Westerlund, presenting
==========================================
Status: Ready for WGLC

Action Items:
- Simon Perreault will review the draft
- Magnus and/or Roni will submit 5285bis to AVTCORE

Discussion:
Issue: switching between the one-byte and two-byte header extension formats is not clearly specified in RFC 5285
Mo Zanaty: the issue is even worse because two-byte header support is not mandatory
Magnus: we suggest we clarify 5285
Roni Even: +1
Colin Perkins: +1
Stephan Wenger: Deprecate the one-byte headers? Is there much deployment of it?
Many people: yes.
Jonathan: Most current deployment is one-byte; not clear there is deployment of two-byte
Mo: Suggest to mandate support for both extensions, and then require that two-byte header extensions use IDs bigger than 15 so that they can be negotiated.
Magnus or Roni will submit 5285bis to AVTCORE
Jonathan: Who read this? 5/6 hands
Jonathan: Who will review? Simon Perreault
Jonathan: many things would likely want to use this.
Roni: there's also a dependency in CLUE
Colin: We should check if something in BUNDLE would also need to be fixed to address the issues Magnus pointed out
Jonathan: OK, we are ready for last call!
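The one-byte/two-byte distinction under discussion can be illustrated with a minimal parsing sketch. This is not a conformant RFC 5285 implementation (it omits the 0xBEDE / 0x100x profile check and the SDP negotiation of IDs); it only shows the element layouts:

```python
def parse_one_byte_elements(data: bytes):
    """RFC 5285 one-byte format: each element starts with one byte
    holding a 4-bit ID (valid range 1-14) and a 4-bit (length - 1)."""
    elems, i = [], 0
    while i < len(data):
        b = data[i]
        if b == 0:                    # zero byte is padding
            i += 1
            continue
        ext_id = b >> 4               # 4-bit ID
        length = (b & 0x0F) + 1       # field stores length minus one
        elems.append((ext_id, data[i + 1:i + 1 + length]))
        i += 1 + length
    return elems

def parse_two_byte_elements(data: bytes):
    """RFC 5285 two-byte format: an 8-bit ID (1-255) followed by an
    8-bit length (0-255), then the data."""
    elems, i = [], 0
    while i < len(data):
        if data[i] == 0:              # padding
            i += 1
            continue
        ext_id, length = data[i], data[i + 1]
        elems.append((ext_id, data[i + 2:i + 2 + length]))
        i += 2 + length
    return elems
```

The 4-bit ID field is why the one-byte format tops out at ID 14, and why Mo's suggestion of requiring IDs above 15 for two-byte elements would let endpoints negotiate which format is in use.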
==========================================
Layer Refresh Request (LRR) RTCP feedback message
Jonathan Lennox
==========================================
Action Items:
- Bernard Aboba and Mo Zanaty to review the H.264-SVC layer refresh text.
- Stephan to work with JCT on whether a new SEI message is needed in H.265 for temporal layer refresh.
- Jonathan will work with co-authors on the VP8 text.
- Jonathan to send text to the list about what FIR means for MRST/MRMT.

Discussion:
Jonathan: this is now a WG document
Deltas: not many. Main one: recognizing a layer refresh point
H.264/SVC text is complete
ACTION: Bernard Aboba and Mo Zanaty are volunteering for reviews of this part
H.265:
Stephan Wenger: is anyone interested at all in non-scalable nested coding structures?
Jonathan: yes, for VP8 and VP9.
Stephan: given the current pace of SDOs, that's likely going to exist in H.265 as well, so I volunteer to do that.
ACTION ITEM: Stephan will take care of this for JCT
ACTION ITEM: Jonathan will nudge colleagues to do reviews for VP8
ACTION ITEM: review the description of how you recognize a layer refresh
Mo: not clear the VP8 Y bit is sufficient to recognize all temporal sync points.
Jonathan: false negatives aren't the end of the world; false positives are bad.
Next slide: Temporal switch points are complicated!
Justin gets up to explain ... says he's confused
Mo explains: these are just three layers of reference frames
Open issue: what does FIR mean for spatial multi-stream scalability?
QUESTION to WG: Refresh a layer, or refresh the whole source? Only relevant for H.264-SVC.
Question: do we want to define this? Do we want to do this here?
Stephan Wenger: wasn't there some kind of consensus somewhere (payload) that new payload formats should have some form of FIR mapping?
Jonathan: shakes head
Stephan: an issue for the SHVC payload format. I will talk to Ye-Kui.
Jonathan: Two issues: what should the semantic be, and how is it instantiated for any given codec.
My recommendation: FIR means refresh all layers, and we need to define something else for per-layer refresh.
Mo: I sort of thought that's what it meant already. After all, it's a FULL intra refresh.
Bernard: For IMTC structures it doesn't make a difference.
Jonathan: If no one is currently doing anything where it makes a difference, it's not an issue.
CONSENSUS: FIR is a full refresh.
Jonathan: Question: where do we say that?
Mo: let's just have this doc update 6190
Jonathan: probably update CCM, not 6190.
Some more discussion follows, showing there's actually no consensus.
Justin: Does this mean that FIR has different behavior for MRST than for simulcast?
Jonathan: Yes.
Justin: I think that's confusing and awkward.
Mo: Maybe have FIR refresh all simulcast streams? [Unhappiness]
Stephan: Multiple prediction coding is coming.
Jonathan: My inclination is to say that MRST and simulcast are different, despite the asymmetry.
Stephan: Possibly recommend that FIR be sent for all layers simultaneously.
Justin: Can we just say that LRR is per-layer and FIR is per-SSRC?
Jonathan: For MRST/MRMT those are the same thing. Stephan's suggestion would be straightforward for MRST, but for MRMT the multiple FIRs will not be in the same packet, so there might be sync issues.
Magnus: write it up and send this to the list (ACTION ITEM)
Stephan agreed
Jonathan agreed: will do!

==========================================
RTCP feedback message for image control
Roni Even
==========================================
Action items:
- Roni: Consider a timestamp indicating which picture the request is relative to
- Roni will draft a liaison statement for 3GPP

Discussion:
Motivations: a picture of Matt Damon; also: get a detailed image for part of an image. Camera zoom and move. Zoom in on a participant in an MCU.
Proposal: have an RTCP message that describes the area the receiver wants zoomed or moved. Requires consent of the sender. Reference is based on the current picture/quality rate. There's also a notification/ack message from the sender.
Mo: question.
You shouldn't do this with absolutes but with relatives, because resolutions change.
Roni: the idea is to have pixels based on the current view.
Mo: but there's no way to know which frame this relates to
Randell J (through Jabber): same question, pointing out the race conditions
Ben: coordinates are unsigned?
Roni: they can be negative, so that you can move outside of your view
Stephen: using reference timestamps should fix most of the race conditions
Roni: Agreed, we will consider this (ACTION ITEM)
Randell: offsets could be made floats (0-1 of the source width) and be relative to the absolute source
Peter T.: Is this a feedback or a control message?
Roni: it's both
Peter: is this weird?
Roni: no
Peter: how do you do reliability?
Roni: we have an ICN response message.
Colin: also, there's a standard convention for doing this
Roni: yes, that's what we are using here
Peter T.: how are you referring to the stream that you mean?
Roni: by SSRC
Peter: what if the SSRC changes?
Roni: then you're receiving a new stream, so you have a new view
Roni: Relatedly, in multipoint, BFCP allows you to say who controls the camera
Stephan Wenger: I think the document is underspecified. You need to say whether your source window is with respect to the decoded bits or only with respect to the bits that are meant for human consumption
Ben C.: clarification on the IANA issue: they registered parameters with IANA, but they're held up because we don't have expert reviewers.
Roni: yes, I was just saying this is why I didn't know they had done it.
Ben C.: We also need to think about whether we want to continue work that's divergent from what they've done, and whether there's harm done by having two ways of doing things.
Magnus: we need to feed back our criticism to the original authors before we start this work
Ben: Is this in a release?
Magnus: Yes, but they have a process for fixing it.
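Randell's suggestion of resolution-independent offsets can be sketched as follows. The function names and signatures are hypothetical, not from any draft; the point is only that a viewport expressed as fractions of the source survives resolution changes:

```python
def to_normalized(x, y, w, h, src_width, src_height):
    """Convert a pixel viewport (x, y, width, height) into fractions
    of the source dimensions, so the request is independent of the
    resolution currently being encoded. Illustrative sketch only."""
    return (x / src_width, y / src_height, w / src_width, h / src_height)

def to_pixels(nx, ny, nw, nh, src_width, src_height):
    """Map a normalized viewport back onto whatever resolution the
    sender happens to be using when the request arrives."""
    return (round(nx * src_width), round(ny * src_height),
            round(nw * src_width), round(nh * src_height))
```

For example, a viewport requested against a 1280x720 view still selects the same region if the sender has meanwhile scaled down to 640x360, which is the race condition the pixel-based proposal runs into.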
Mo: a suggestion about the semantics of the ICN message: it would be useful if the sender gave the current viewport
Roni: makes sense
Randell (via Jabber): we already do dynamic resolution scaling in Firefox, so using non-pixel-based values is a win. It also means you don't have to keep a memory of sizes, and it avoids confusion when using layered encodings.
Jonathan: I think we are better off helping 3GPP fix their own message than doing our own competing message, if possible.
Cullen: +1
Jonathan: Rachel and Roni, go to a 3GPP meeting and help them fix this. Also, please draft the text of a liaison statement.
Roni: can we ask if people care about this?
Magnus: ok. WG QUESTION: WHO CARES ABOUT THIS AT ALL?
Cullen: maybe some of our video surveillance people would.
Bo: maybe I care too. But I think we should try to work it out in 3GPP. (PROMISE)
Jonathan: could you be the person who catches the liaison statement in 3GPP?
Bo: Myself or a close colleague.
Stephan Wenger: Let's stop using the poor design of H.281 as an excuse. No one has given valid reasons for replacing it.
Jonathan: The 3GPP spec recommends both H.281 and an RTCP feedback message, presumably for different circumstances.
Stephen Botzko: we tried to replace this with H.282 but no one implemented it. So I agree with Stephan.
ACTION ITEM AND CONSENSUS: Roni will write a liaison to 3GPP and we'll see what happens there

==========================================
Video Frame Info (RTP Header Extension)
Mo Zanaty
==========================================
Status: Consensus to adopt as a WG document.
Action Item: Chairs to work with the ADs to add a milestone.

Discussion:
Motivation: we need payload-agnostic RTP switching (as in SFUs). The reason is that the payload is often encrypted, and even if not, it could be in an unknown format.
Peter Thatcher: the original idea was to be codec-agnostic, but there is codec-specific information?
Mo: If you need spatial/quality information, yes. H.265's combined layerId forces this.
Peter: Would it be possible to have some way to specify the semantic explicitly?
Mo: I'm not sure how that's easier than doing it per payload type. Perhaps we could write recommendations for how future codecs do things.
Stephan Wenger: Some stuff is still missing in this document. Design suggestion: let's just say you grab the first N bits of your codec packet and copy them here. This is not future-proof, and we can make it future-proof.
Jonathan: you would certainly need the TL0 picture index. For spatial scalability you would also need something equivalent to the VP9 scalability structure / scalability update, which tells you what layers you can expect to receive. Also a suggestion: you could restrict this to temporal only, as that behavior is much better defined and understood. If we do keep these layer IDs, we could arrange things so they are the same as in the LRR spec.
Bernard Aboba: +1 on the TL0 picture index.
Mo: OK, I'll take this up on the list
Jonathan: how many people know what this is about? MANY. How many people think we should work on it? MANY. Anyone thinking we shouldn't?
Stephan: We need to understand the Selective Forwarding Unit architecture better before we work on it.
Bernard: We need this today, have multiple proprietary implementations of it, and we can learn more by working on it.
Mo: Bernard's document outlines a problem; this document can be viewed as a solution to that problem
Stephan: we shouldn't work on this until we study it well and know exactly what we want to do.
Cullen: I am on the other end. Let's just get it done to address today's problems.
Bernard: The proprietary approaches only solve the temporal case. This draft addresses today's problems and even goes a bit beyond. So that's good enough.
Magnus: Who supports this as a WG doc? 10+ HANDS
Magnus: Against? None
Magnus: will work with the ADs to add a milestone
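The SFU use case that motivates the draft can be sketched in a few lines. The field names below are hypothetical, not taken from the draft; the point is that the forwarding decision uses only a generic frame-info header extension, never the (possibly encrypted) codec payload:

```python
def should_forward(frame_info: dict, max_temporal_id: int) -> bool:
    """Payload-agnostic forwarding rule: keep the base layer and any
    temporal layer at or below the subscriber's limit, drop the rest.
    No decryption or bitstream parsing is needed."""
    return frame_info["temporal_id"] <= max_temporal_id

# A hypothetical packet sequence carrying parsed frame-info values.
stream = [
    {"seq": 1, "temporal_id": 0},  # base layer: always forwarded
    {"seq": 2, "temporal_id": 2},  # top layer: dropped for a tid-1 subscriber
    {"seq": 3, "temporal_id": 1},
]
forwarded = [p["seq"] for p in stream
             if should_forward(p, max_temporal_id=1)]
# forwarded == [1, 3]
```

This is also why the discussion keeps returning to the TL0 picture index and the VP9 scalability structure: without them, the SFU cannot tell whether the layers it plans to forward form a decodable set.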