
RTP Payload Format for the Opus Speech and Audio Codec
RFC 7587

Yes


No Objection

(Alia Atlas)
(Alvaro Retana)
(Deborah Brungard)
(Jari Arkko)
(Joel Jaeggli)
(Martin Stiemerling)
(Terry Manderson)

Note: This ballot was opened for revision 08 and is now closed.

Barry Leiba Former IESG member
Yes
Yes (2015-04-07 for -08) Unknown
Very clear document; well written.

The Abstract and Introduction both understate what this document is.  It doesn't just define the payload format and registrations, but also provides what I would consider an Applicability Statement in Section 3: normative, Standards Track advice about how to use the Opus codec, complete with MUST and SHOULD and SHOULD NOT.  That's fine, but both the Abstract and Introduction should say that.

-- Section 4.1 --

   Opus supports 5 different audio bandwidths, which can be adjusted
   during a call.

Wouldn't "during a stream", or "during active streaming", or perhaps "during an RTP session", or some such be better than referring to "a call"?

-- Section 5 --

   It is RECOMMENDED that senders of Opus encoded data apply congestion
   control.

Does this SHOULD come with any reference to how to do congestion control?  The previous paragraph has some vague statements about doing congestion control, but is there anything more concrete that I could refer to if I were implementing this?

-- Section 6.1 --
I see that the document shepherd asked for a media-types review in December, and that there were no comments.  Thanks for making sure that got done, Ali.

Ben Campbell Former IESG member
Yes
Yes (2015-04-01 for -08) Unknown
Section 7.1 has some normative language that seems more about the meaning of parameters than about SDP Offer/Answer considerations. That might be more appropriate in the parameter definitions.

It might be worth having the security considerations mention the VBR security discussion in section 3.1.2.

Section 3.1.3, last sentence:

s/restraints/constraints/

Alia Atlas Former IESG member
No Objection
No Objection (for -08) Unknown

                            
Alvaro Retana Former IESG member
No Objection
No Objection (for -08) Unknown

                            
Brian Haberman Former IESG member
No Objection
No Objection (2015-04-08 for -08) Unknown
I agree with Barry that Section 3 should be explicitly called out as an applicability statement.

Deborah Brungard Former IESG member
No Objection
No Objection (for -08) Unknown

                            
Jari Arkko Former IESG member
No Objection
No Objection (for -08) Unknown

                            
Joel Jaeggli Former IESG member
No Objection
No Objection (for -08) Unknown

                            
Kathleen Moriarty Former IESG member
(was Discuss) No Objection
No Objection (2015-04-20) Unknown
Thank you for adjusting the text in the security considerations section to encourage the use of strong security mechanisms.

Martin Stiemerling Former IESG member
No Objection
No Objection (for -08) Unknown

                            
Spencer Dawkins Former IESG member
No Objection
No Objection (2015-04-06 for -08) Unknown
Meta-comment - it looks like the notification field is pointing only to Ali. If that's intentional, rock on, but I've had some accidentally-minimal notification lists on drafts recently, and wanted to mention that.

This draft seemed very clear to me, one not skilled in the art. Thank you for that. 

I have a few comments. If you have questions, please ask, but they're non-blocking (so do the right thing).

In this text:

3.  Opus Codec

   Further, Opus allows transmitting stereo signals.
   
I wasn't sure what this was telling me until I got to section 3.4, and saw this text: "signaled in-band in the Opus payload". Perhaps adding that phrase in Section 3 would be helpful?

In this text:

3.1.2.  Variable versus Constant Bitrate

   One
   reason for choosing CBR is the potential information leak that
   _might_ occur when encrypting the compressed stream.  See [RFC6562]
   for guidelines on when VBR is appropriate for encrypted audio
   communications.
   
I THINK I know what "potential information leak" means in this case, but I'm guessing. I learned a lot from the [RFC6562] reference. Is it possible to provide a short clue here? Would it be correct to say something like "One reason for choosing CBR is that some codecs have been observed to provide predictable VBR patterns for highly structured dialogs where only a few phrases are expected, potentially leaking information without requiring an eavesdropper to decrypt the payload"?

Also in 3.1.2:

   The bitrate can be adjusted at any point in time.  To avoid
   congestion, the average bitrate SHOULD NOT exceed the available
   network bandwidth.  
   
I'm kind of wondering how you intend for the sender to know what the "available network bandwidth" is. I notice a reference in Section 3.3 to 

   (2) an externally-provided estimate of the
   channel's capacity; 
   
Is "the available network bandwidth" this externally provided estimate? What should a sender be looking at, to fulfill this SHOULD NOT?

Of course, I'm wondering why this is SHOULD NOT, instead of MUST NOT. Is this recognizing that available network bandwidth can change instantaneously (so a careful sender might still find itself sending too fast for a short period of time), or is there something else going on?

As long as I'm mentioning section 3.3, if the "externally-provided estimate" had some reference, that would be helpful (unless, of course, this is obvious to those skilled in the art). The term only appears twice in 3.3, with no pointers.
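
To make the question about the SHOULD NOT concrete, here is a minimal sketch of how a sender might keep its average bitrate under an externally provided capacity estimate. It is an illustration only, not something taken from the draft: the function name, the 85% headroom, and the 6 to 510 kbit/s encoder range are assumptions, and where the estimate itself comes from is exactly the open question raised above.

```python
# Assumed encoder bitrate range in bits per second; not quoted from the ballot text.
OPUS_MIN_BPS = 6_000
OPUS_MAX_BPS = 510_000


def target_bitrate(estimate_bps, headroom_percent=85):
    """Pick an average encoder bitrate from a capacity estimate.

    estimate_bps is the hypothetical externally provided estimate.
    headroom_percent leaves room for packet headers and for the
    estimate being stale; 85 is an arbitrary illustrative choice.
    """
    budget = estimate_bps * headroom_percent // 100
    # Clamp to the assumed encoder range. If the estimate is below the
    # minimum, the result still exceeds it, which is one case where a
    # SHOULD NOT (rather than a MUST NOT) leaves the sender some room.
    return max(OPUS_MIN_BPS, min(OPUS_MAX_BPS, budget))
```

A sender following this sketch would call `target_bitrate` again whenever the estimate changes, since the draft text says the bitrate can be adjusted at any point in time.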

In this text:

5.  Congestion Control

   It is RECOMMENDED that senders of Opus encoded data apply congestion
   control.
   
Is there a particular mechanism you're thinking of here? If you could make that clearer, even just providing a reference, that would be helpful for me as a reader.

In other news, I support Kathleen's Discuss on SHOULD SRTP, and if it becomes a SHOULD and not a MUST, I'd suggest adding a sentence or two explaining why not using SRTP is the right thing to do.

Stephen Farrell Former IESG member
(was Discuss) No Objection
No Objection (2015-04-07 for -08) Unknown
- 3.1.2: s/_might_ occur/occurs/. The information leak
unquestionably does occur; the only uncertainty is whether or
not someone cares about it, whereas the current text implies
that the leak might not be real. As written, the text is
misleading, though not sufficiently to make this a DISCUSS.
(I am assuming there is no result that shows that encrypted
Opus traffic is somehow not leaking information in the way
other codecs do.)

- 6.1, maxptime and ptime: I read this as saying that 3, 6,
9, etc. are all valid values, but that, e.g., 43 is not. Is that
correct? In other words, can I mix frames that contain
different durations of audio into one RTP packet? (You may
have told me earlier, but I've forgotten already, sorry :-)
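
One way to explore the ptime/maxptime question is to enumerate the values that one reading of the rule would allow. The sketch below assumes, without confirmation from the ballot text, that Opus frames last 2.5, 5, 10, 20, 40, or 60 ms, that a packet carries at most 120 ms of audio, and that the SDP value is the packet duration rounded up to the next whole millisecond. The function names are hypothetical.

```python
# Assumed frame durations (2.5, 5, 10, 20, 40, 60 ms) in half-millisecond
# units, so that all arithmetic stays in integers.
FRAME_SIZES_HALF_MS = (5, 10, 20, 40, 80, 120)
MAX_PACKET_HALF_MS = 240  # assumed 120 ms cap per packet


def valid_ptime_values():
    """Return every whole-millisecond value reachable under the assumptions."""
    values = set()
    for frame in FRAME_SIZES_HALF_MS:
        total = frame
        while total <= MAX_PACKET_HALF_MS:
            # Round the duration up to the next full millisecond.
            values.add((total + 1) // 2)
            total += frame
    return sorted(values)


def is_valid_ptime(ms):
    """Check a single candidate ptime or maxptime value."""
    return ms in valid_ptime_values()
```

Under these assumptions, 3 and 43 would both qualify (one and seventeen 2.5 ms frames, respectively), while 6 and 9 would not. Because every assumed frame duration is a multiple of 2.5 ms, the set of values is the same whether or not frames of different durations can share a packet, so the sketch does not settle the mixing question.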
Terry Manderson Former IESG member
No Objection
No Objection (for -08) Unknown