A Real-time Transport Protocol (RTP) Header Extension for Client-to-Mixer Audio Level Indication
RFC 6464
Note: This ballot was opened for revision 06 and is now closed.
(Gonzalo Camarillo; former steering group member) Yes
(Adrian Farrel; former steering group member) No Objection
I think Stewart's security point is valid, although I am not quite sure how this differs from simply raising your voice.
(Dan Romascanu; former steering group member) No Objection
(Jari Arkko; former steering group member) No Objection
(Pete Resnick; former steering group member) No Objection
(Peter Saint-Andre; former steering group member) No Objection
(Ralph Droms; former steering group member) No Objection
(Robert Sparks; former steering group member) (was Discuss) No Objection
The Security Considerations section sketches a scenario in which an attacker sends high level indications alongside encoded audio that is actually silent, in order to suppress other participants' audio. A more likely attack is one that sends high level indications simply to seize whatever speaker-selection algorithm a conference system uses.
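The speaker-selection concern can be illustrated with a small sketch. It assumes a hypothetical mixer that picks the active speaker purely from the claimed level, where the level is the 7-bit value carried in the header extension (0 to 127, expressed in -dBov, so a smaller number means louder audio). The function name and participant names are invented for illustration and do not come from the document or from any real conference system.

```python
def select_speaker(claimed_levels):
    """Pick the participant with the loudest *claimed* level.

    claimed_levels maps a participant name to the level taken from the
    header extension (0 = loudest, 127 = quietest, in -dBov).
    This naive policy trusts the claim and never inspects the audio.
    """
    return min(claimed_levels, key=claimed_levels.get)


# Two honest participants: alice is the louder one.
honest = {"alice": 25, "bob": 40}
honest_winner = select_speaker(honest)

# Hypothetical attacker who always claims the maximum level (0),
# regardless of what the encoded audio actually contains.
with_attacker = dict(honest, mallory=0)
attacked_winner = select_speaker(with_attacker)
```

Under this naive policy, `honest_winner` is alice, while `attacked_winner` is mallory for as long as the false claim is sent, which is the seizure of the selection algorithm described in the comment.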
(Ron Bonica; former steering group member) No Objection
(Russ Housley; former steering group member) No Objection
(Sean Turner; former steering group member) No Objection
I noted the same things Stephen did.
(Stephen Farrell; former steering group member) No Objection
(1) If VAD can expose encrypted VBR, then why don't the security considerations here say "if encrypting VBR and doing VAD then you MUST apply commensurate protection to both"? I don't get the logic in the current section 6 where it says "if encrypting VBR and doing VAD then you SHOULD use some additional mechanism". What is the exceptional case that justifies the SHOULD there, and why would you ever do something appreciably weaker or stronger?
(2) Is the alternative to srtp-encrypted-header-ext to use IPsec or TLS, or something else? It would be better to reference those, since if you don't, I don't see how srtp-encrypted-header-ext isn't a normative reference. I'd suggest adding a reference to either TLS or IPsec, whichever is more likely to be used.
(Stewart Bryant; former steering group member) No Objection
"A malicious endpoint could choose to set the values in this header
extension falsely, so as to falsely claim that audio or voice is or
is not present. It is not clear what could be gained by falsely
claiming that audio is not present, but an endpoint falsely claiming
that audio is present could perform a denial-of-service attack on an
audio conference, so as to send silence to suppress other conference
members' audio."
... you could also dominate the conversation by always claiming that you have strong audio present.
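To make the falsification concern concrete, here is a minimal sketch of how a receiver might split the extension's data byte into its voice-activity flag and level, and compare the claimed level with one measured from decoded audio. The byte layout (top bit as the V flag, low 7 bits as the level in -dBov) follows the header extension format; the function names, the 16-bit PCM assumption, and the 20 dB tolerance are assumptions made for illustration, and the cross-check is only one possible countermeasure, not something the comments above prescribe.

```python
import math
from typing import Sequence, Tuple

MAX_LEVEL = 127  # quietest representable level, i.e. -127 dBov


def parse_audio_level(ext_byte: int) -> Tuple[bool, int]:
    """Split the extension data byte into (voice_activity, level).

    The top bit is the V flag; the low 7 bits are the level in -dBov
    (0 = loudest, 127 = quietest).
    """
    if not 0 <= ext_byte <= 0xFF:
        raise ValueError("extension data must fit in one byte")
    return bool(ext_byte & 0x80), ext_byte & 0x7F


def measured_level(samples: Sequence[int], overload: float = 32768.0) -> int:
    """Estimate the level in -dBov from decoded PCM samples.

    Assumes signed 16-bit samples; 'overload' is the assumed full-scale value.
    """
    if not samples:
        return MAX_LEVEL
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return MAX_LEVEL  # digital silence
    db = 20 * math.log10(rms / overload)  # 0 or negative for in-range audio
    return min(MAX_LEVEL, max(0, round(-db)))


def looks_falsified(claimed: int, measured: int, tolerance_db: int = 20) -> bool:
    """Flag a claim that is much louder than the measured audio.

    Levels are in -dBov, so "louder" means a smaller number.
    The tolerance is an arbitrary illustrative threshold.
    """
    return measured - claimed > tolerance_db
```

For example, an endpoint that sends the byte `0x80` (V flag set, level 0) together with frames that decode to all-zero samples would have a claimed level of 0 and a measured level of 127, which `looks_falsified` reports as suspicious. Running such a check requires decoding the audio, which is part of the cost the header extension is meant to avoid, so a mixer would more plausibly apply it only occasionally.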
=======
This security consideration from the mixer-to-client document looks like it might be applicable here:
2. Furthermore, the fact that audio level values would not be
protected even in an SRTP session might be of concern in some
cases where the activity of a particular participant in a
conference is confidential. Also, as discussed in
[I-D.perkins-avt-srtp-vbr-audio], an attacker might be able to
infer information about the conversation, possibly with phoneme-
level resolution.
(Wesley Eddy; former steering group member) No Objection