NetVC IETF 99, Monday 17 July 2017, Afternoon Session II

-------- Agenda --------

* No agenda bashing
* Tim Terriberry will present Daala and Thomas Daede's slides

-------- Chair Slides --------

Video codec requirements and evaluation methodology: updated draft by Alexey Filippov, Andrey Norkin, Jose Alvarez

Discussion:
* Mo Zanaty: the requirements document is ready for progressing; in the current version, most changes are around section 3.1.1; no other substantive changes; last call ended at the end of May; current status: shepherd write-up and passing it on to the AD
* Earlier question: whether or not we publish it; the main impetus is that it will also be used by other bodies

-------- Test and Evaluation Criteria --------

draft-ietf-netvc-testing
PRESENTER Tim Terriberry (slides by Thomas Daede)

NETVC Testing
* Not a lot of changes to the testing document
* Have started exercising some of the subjective testing procedures in it
* Added a subjective test set (a small subset of the objective test set)

Statistical analysis
* Generally 12 viewers are needed for results to be significant
* SP 50 will be changed in the future
* CDEF (constrained directional enhancement filter) ended up being significantly better than CLPF for a number of videos
* CLPF: these tests are all completed; you can still vote, but the results have already been calculated
* Jonathan Lennox: do we really intend to have only the Sintel video up there twice?
* Answer: not sure

Test: https://arewecompressedyet.com/

* Mo Zanaty: for new subjective tests, we will start forwarding them to the list if people are willing to give their feedback on them

-------- Thor Update --------

PRESENTER Steinar Midtskogen

* No updates since IETF 98 (spring 2017)
* Last consensus: have Thor and Daala converge
* Wish list: a tool designed to improve screen content; this has not started yet
* Concerns about buffer requirements: both filters can do vertical filtering; originally fixed by restricting the second filter (a quick fix)
* Steinar tried to find another fix: combined the two passes into one
* Used the new subjective test framework in AWCY
* Tests were done in AV1, but there shouldn't be much difference for Thor
* In all cases the objective scores for CDEF are slightly better
* High-latency vs. low-latency results: high latency has more ties
* Objective codec comparisons: did not use objective-2-fast because it sometimes breaks AV1
* AV1 compression history: bitrate has decreased over the last year; compression gains are slightly more than 20%, and most of that has come in the last three months
* AV1 complexity history: the y-axis is logarithmic, in frames per minute (not fps). To get a 20% compression gain, the complexity goes up by about 1000%
* Tim Terriberry: not sure which commits Steinar measured, but there were changes that allowed much quicker selections; the expansion probably made it much slower, then it sped up, then got faster again
* Steinar Midtskogen: complexity could go down
* Mo Zanaty: data points? Steinar Midtskogen: twice a month, same configuration; selected whatever was first in the repository on the 15th of each month

-------- Codec Comparison: Thor, VP9, AV1 --------

* Thor and VP9 seem to have the same complexity/compression trade-off, except that Thor can get more compression at the cost of added complexity
* AV1 performs better
* If we limit the test set to screen content sequences, Thor performs much better than VP9 but not as well as AV1
* It's possible to get Thor to perform roughly as well as AV1, but with a fraction of the tools and added complexity
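Codec comparisons like these are typically reported as BD-rate (Bjøntegaard delta rate): the average bitrate change needed to reach the same quality as the anchor codec. A minimal sketch of one common way to compute it, assuming each codec is summarized by a few (bitrate, PSNR) points; the function name and example numbers are hypothetical, and this is not the exact AWCY tooling:

    # Illustrative Bjontegaard-delta-rate sketch (assumed inputs; not the AWCY code).
    # Each curve is a list of (bitrate_kbps, psnr_db) points for one codec.
    import numpy as np

    def bd_rate(anchor, test):
        """Approximate average bitrate change (%) of 'test' vs 'anchor' at equal PSNR."""
        r1 = np.log(np.array([pt[0] for pt in anchor]))
        p1 = np.array([pt[1] for pt in anchor])
        r2 = np.log(np.array([pt[0] for pt in test]))
        p2 = np.array([pt[1] for pt in test])
        # Fit log-rate as a cubic polynomial of PSNR for each codec.
        fit1 = np.polyfit(p1, r1, 3)
        fit2 = np.polyfit(p2, r2, 3)
        # Integrate both fits over the overlapping PSNR range.
        lo, hi = max(p1.min(), p2.min()), min(p1.max(), p2.max())
        int1 = np.polyval(np.polyint(fit1), hi) - np.polyval(np.polyint(fit1), lo)
        int2 = np.polyval(np.polyint(fit2), hi) - np.polyval(np.polyint(fit2), lo)
        # Average log-rate difference, converted back to a percentage.
        avg_diff = (int2 - int1) / (hi - lo)
        return (np.exp(avg_diff) - 1) * 100

    # Example with made-up points: a negative result means 'test' needs fewer bits
    # than 'anchor' for the same quality.
    print(bd_rate([(400, 34.1), (800, 37.0), (1600, 39.8), (3200, 42.2)],
                  [(380, 34.3), (760, 37.2), (1500, 40.0), (3000, 42.4)]))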
* Mo Zanaty: what amount of screen content is in the earlier test set?
* Steinar Midtskogen: at least one sequence had a BD-rate score 80% better than Thor
* Tim Terriberry: the Wikipedia set (a screen capture of someone scrolling through Wikipedia), plus a few Twitch videos (Minecraft)
* Steinar Midtskogen: with CDEF we should get a slight improvement
* Jonathan Lennox: you don't anticipate any complexity costs?
* Steinar Midtskogen: nothing that huge; for the entropy coder, some complexity, but not a doubling or anything like that. The screen content tool hasn't been invented yet.

-------- Daala Update --------

PRESENTER Tim Terriberry

* This change is something we discovered while working with the VP9 selections
* How this works in VP9 (VP9 slide)
* Proposal for AV1; AV1 has all the same problems as VP9 and more problems on top of that
* Mo Zanaty: a comment on resilience and the frame numbers that have been added; with a much larger frame number (10 or 12 bits), if you drop a frame you actually know that you dropped one
* Right. Wanted to have some consistent way of solving this problem; hence the proposal
* "Before" slide: basically the situation now; each one has a buffer of actual pixels in it
* "After" (proposed): move all the probabilities up into the reference frame; the global motion data moves up as well
* Whatever the first frame in your list of reference frames is, you draw your reference pixels, your probabilities, and all of your motion data from it
* Mo Zanaty: do you mean to say that before you could update a context after decoding a non-reference frame, but now you can't?
* TO DO slide: global motion (relatively recent; the proposal is not complete yet), frame size prediction

-------- Chroma From Luma --------

PRESENTER Tim Terriberry; Luc Trudeau and David Michael Barr did most of the work

* Update to the CfL proposal
* This presentation topic: used solely for intra-prediction
* CfL was originally designed to work within Daala; it is hard to do in other codecs
* A lot of CfL proposals try to build a linear model implicitly from data; this does not work very well
* No longer requires PVQ (perceptual vector quantization) because we're doing everything in the spatial domain
* The decoder is nice and simple: it just uses the parameters that were sent
* CfL encoder-side slide, to answer "Are the models going to have some constant offset?"
* These feed into the search for the best linear parameters
* A couple of choices were made for efficiency reasons
* Mo Zanaty: one question about your alphas; have you ever looked to see whether one plane is useful for another plane?
* We code them together: a direction in that plane and a magnitude in that direction; the probability will increase to the extent that those are correlated
* Boundary handling complicates things because (see slide)
* 1 pixel: has a small effect on metrics, nothing visible in the picture
* Steinar Midtskogen: does it make sense to use something like alpha values to drive predictions [missed]
* Tried in Daala, didn't help. May be worth revisiting in AV1; the answer is maybe.
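To make the signalled-alpha form of CfL described above concrete, here is a minimal sketch of the decoder-side prediction plus a simple encoder-side alpha fit. Assumptions (not from the presentation): the reconstructed luma has already been subsampled to the chroma block size, one alpha is sent per chroma plane, and the constant offset comes from the ordinary DC prediction; the names are illustrative rather than the AV1/Daala code.

    # Minimal Chroma-from-Luma sketch (illustrative; not the AV1/Daala implementation).
    import numpy as np

    def cfl_predict(luma_recon, dc_pred, alpha):
        """Predict a chroma block from co-located reconstructed luma samples.

        luma_recon: reconstructed luma block, already subsampled to the chroma size
        dc_pred:    the usual DC (constant) chroma prediction for the block
        alpha:      scaling parameter signalled in the bitstream (one per chroma plane)
        """
        # Remove the luma average so only the "AC" part of luma drives the prediction;
        # in this sketch the DC predictor supplies the constant offset, so none is sent.
        luma_ac = luma_recon - luma_recon.mean()
        return dc_pred + alpha * luma_ac

    def fit_alpha(luma_recon, chroma_orig):
        """One simple encoder-side choice: a least-squares fit of alpha."""
        luma_ac = luma_recon - luma_recon.mean()
        chroma_ac = chroma_orig - chroma_orig.mean()
        denom = np.sum(luma_ac * luma_ac)
        return 0.0 if denom == 0 else float(np.sum(luma_ac * chroma_ac) / denom)

In practice an encoder would restrict alpha to the small set of values the bitstream can signal and pick the best one by rate-distortion cost, which is the "search for best linear parameters" mentioned above.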