Last Call Review of draft-ietf-clue-framework-24
review-ietf-clue-framework-24-secdir-lc-hallam-baker-2015-12-03-00

Request Review of draft-ietf-clue-framework
Requested revision No specific revision (document currently at 25)
Type Last Call Review
Team Security Area Directorate (secdir)
Deadline 2015-12-01
Requested 2015-11-19
Authors Mark Duckworth , Andrew Pepperell , Stephan Wenger
I-D last updated 2015-12-03
Completed reviews Secdir Last Call review of -24 by Phillip Hallam-Baker (diff)
Opsdir Telechat review of -24 by Éric Vyncke (diff)
Assignment Reviewer Phillip Hallam-Baker
State Completed
Request Last Call review on draft-ietf-clue-framework by Security Area Directorate Assigned
Reviewed revision 24 (document currently at 25)
Result Has issues
Completed 2015-12-03
I have reviewed this document as part of the security directorate's
ongoing effort to review all IETF documents being processed by the
IESG.  Document editors and WG chairs should treat these comments just
like any other last call comments.

This standards track document describes what is essentially an enhanced data
model for negotiating telepresence configurations in cases where a given party
may have multiple capture devices offering multiple streams. Choice of streams
may be constrained by device capabilities. A camera may offer a closeup of the
speaker or a wide view of the panel but not be capable of providing both.

Security considerations.

One context issue I am having here is understanding what the relation of this
document is to the others it is referencing. For example, there is a normative
reference to draft-ietf-clue-protocol-06. Is that to be considered by the IESG
at this point? If so, it does not have a Security Considerations section.

If the point is to publish the framework doc as an RFC so as to set the context
for further discussions of the protocol, this is OK. But otherwise there is a
normative reference to a document that doesn't have a security considerations
section and desperately needs one.

This is a big problem as the Security Considerations section in framework is
pointing forward to 'authorization mechanisms' that are presumably to be
described in protocol.

Given this situation, these comments may be taken as input to the framework doc
or the documents to be written using framework as the architecture.

As a general matter, it would be easier to analyze security if terms such as
'confidentiality' and 'integrity' were used. This is particularly the case when
the specification in question is dealing with audio and video. For example, the
phrase "an endpoint attempting to listen to sessions in which it is not
authorized to participate" is almost certainly intended to cover video as well,
which is seen and not heard.

Looking at the security considerations in this way gives us the following:

Confidentiality:

   Disclosure of media streams to an unauthorized endpoint.

   Disclosure of metadata to capture devices.

   Failure to terminate access to media streams at completion of a session.

Integrity:

   Modification of media stream data.

   Introduction of spurious media streams.

Service:

   Denial of service against capture devices.

   Denial of service against output devices.

I think this approach would be helpful when it comes to writing the protocol
authorization sections.
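
As an illustration only, the list above could be carried into the protocol
work as a simple checklist. The following is a minimal sketch, assuming a
Python representation; the names Property, THREATS and unaddressed are
hypothetical and do not come from any CLUE document.

```python
from enum import Enum


class Property(Enum):
    """Security properties used to group the threats (hypothetical names)."""
    CONFIDENTIALITY = "confidentiality"
    INTEGRITY = "integrity"
    SERVICE = "service"


# Threats as listed in this review, grouped by the property they violate.
THREATS = {
    Property.CONFIDENTIALITY: [
        "Disclosure of media streams to an unauthorized endpoint",
        "Disclosure of metadata to capture devices",
        "Failure to terminate access to media streams at completion of a session",
    ],
    Property.INTEGRITY: [
        "Modification of media stream data",
        "Introduction of spurious media streams",
    ],
    Property.SERVICE: [
        "Denial of service against capture devices",
        "Denial of service against output devices",
    ],
}


def unaddressed(mitigated):
    """Return, per property, the threats with no recorded mitigation."""
    return {
        prop: [t for t in threats if t not in mitigated]
        for prop, threats in THREATS.items()
    }
```

Calling unaddressed() with the set of threats a draft already covers shows at
a glance which properties still lack an authorization or protection mechanism.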

As a general rule, the term 'endpoint' is now meaningless and should not be
used. Yes, end-to-end security is a good thing. But you show me which are the
'endpoints' here.

End to end is Alice's brain to Bob's brain.

Between that we have mouth/face -> cameras/mics -> capture host(s) ->
inter-network -> output host(s) -> displays/speakers -> eyes/ears.

An attacker may target any of those modules and any of the interfaces between
them. Using the term 'endpoint' is ambiguous.
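
To make the point concrete, the chain can be enumerated mechanically. This is
a rough sketch, assuming the seven modules named above; PIPELINE and
attack_surfaces are hypothetical names used only for illustration.

```python
# Modules on the path from speaker to listener, as enumerated in this review.
PIPELINE = [
    "mouth/face",
    "cameras/mics",
    "capture host(s)",
    "inter-network",
    "output host(s)",
    "displays/speakers",
    "eyes/ears",
]


def attack_surfaces(modules):
    """List every module and every interface between adjacent modules."""
    surfaces = [("module", m) for m in modules]
    surfaces += [
        ("interface", f"{a} -> {b}") for a, b in zip(modules, modules[1:])
    ]
    return surfaces
```

With seven modules and six interfaces between them, there are thirteen
distinct places an attacker might target, and 'endpoint' does not say which
of them is meant.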

The metadata disclosure problem can be quite insidious. Let us say we are using
CLUE to collect media streams from a home security grid. I have 11 cameras on
the perimeter pointing in, another 7 on the residence pointing out, and one
on my desk. The one on my desk can be considered trustworthy; if someone
has compromised that, I am screwed. But that isn't the case for the perimeter
net, which is cobbled together from Raspberry Pis and cheapo cameras. That net
is placed in a location I know is vulnerable.

Let's say we have an intrusion. The first thing I do is fire up a conference
call with my security contractor. I don't want someone to be able to compromise
one of my perimeter cameras in a way that tips them off to the fact the
intrusion has been detected.

Introduction of spurious streams might be one of the best ways to attack a
conferencing system. If I can see the main speaker and the audio is a little
fuzzy, an attacker introduces an additional stream with filtering that makes it
more attractive to whatever AI is managing the conference. Now the attacker can
literally put words in people's mouths. Could be fun for politicians giving
town halls.