Minutes for CLUE at interim-2012-clue-3
minutes-interim-2012-clue-3-2

Meeting Minutes ControLling mUltiple streams for tElepresence (clue) WG
State Active
Last updated 2012-10-08

IETF CLUE WG interim meeting, Sept. 19-20th, 2012
San Jose, CA
Hosted by Cisco

Attendees:
----------
Mary Barnes (Polycom) (chair)
David Benham (Cisco)
Stephen Botzko (Polycom)
Bo Burman (Ericsson)
Roni Even (Huawei)
Rob Hansen (Cisco)
Christer Holmberg (Ericsson)
Paul Kyzivat (Huawei) (chair)
Jonathan Lennox (Vidyo)
Andy Pepperell (Silverflare)
Erick Sasaki (NTT) (Wed only)
Stephan Wenger (Vidyo)
Paul Witty (Silverflare)

Webex:
------
Espen Berger (Cisco)
Gonzalo Camarillo (Ericsson) (AD)
Keith Drage (Alcatel Lucent)
Cullen Jennings (Cisco)
Dan Romascanu  (Avaya)

====================================================
Notetakers:  Mary Barnes, Rob Hansen, Andy Pepperell
====================================================

Recordings:
===========

Sept. 19th
----------
- Streaming:
https://ietf.webex.com/ietf/ldr.php?AT=pb&SP=MC&rID=8311957&rKey=01cb3a68a6ab26ec
- Download:
https://ietf.webex.com/ietf/lsr.php?AT=dw&SP=MC&rID=8311957&rKey=f34973ea517e24da

Sept. 20th
----------
- Streaming:
https://ietf.webex.com/ietf/ldr.php?AT=pb&SP=MC&rID=8329807&rKey=f23021d9f3acea02
- Download:
https://ietf.webex.com/ietf/lsr.php?AT=dw&SP=MC&rID=8329807&rKey=9ee5e7c7279affb3

Conclusions:
============
1) Ticket #16. Agreed to use the term "capture encoding". (Action ii)
2) Do we need additional facilities for the advertisement to indicate
   limitations on simulcast?  Yes.  (Action v)
3) Do we need a way to reject a Configure message? Conclusion: Yes.
   - Under what circumstances do you reject?  Only when it is malformed or
     the configure requests something that wasn’t advertised.
   - What does it mean if you reject?  The generator of the reject should
     send a new advertisement.
   - Do we need an explicit Ack?  Assume yes for now.  (Action vii)
4) How do we signal support for CLUE?  Conclusion: Define a feature tag.
   (Action viii)
5) Ticket #15. What signaling protocol should be used for the CLUE
   application specific information? Conclusion: General agreement to use
   XML schema to represent the information to be carried in the protocol.
   (Ticket remains open as we still need to re-consider reusing existing
   protocols.)
6) Is there an expectation on establishing a CLUE session that you will
   immediately (after 200 OK) send/get an advertisement? Conclusion:  No.
7) Empty Advertisement:  Is this allowed and what does it mean? Conclusion:
   Yes. It means you have nothing you are willing to send. May also still
   want media in other direction.
8) Should it be allowed to honor out-of-date advertisements? Conclusion:
   Yes. Decision of advertiser. Would make sense to honor in particular if
   you just sent an advertisement that only adds.  [Note: renumbering a
   capture or a missing number indicates that you won’t send that capture
   anymore.]
9) Are you allowed to send “plain” SIP/legacy media before 1st advertisement
   (i.e., between 1st O/A with m-line under CLUE control and Configure) when
   you know it’s CLUE? Conclusion: Define empty advertisement and default
   config (e.g., one capture).  Perhaps just describe within use case and/or
   state model.
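
Conclusion 3 can be illustrated with a minimal sketch. CLUE message formats
had not been defined at this meeting, so the message shapes, the field names
(capture_ids, "requested") and the function check_configure are all
hypothetical. The sketch only shows the two agreed reject conditions: a
malformed configure, or a configure that requests something that was not
advertised.

```python
from dataclasses import dataclass, field


@dataclass
class Advertisement:
    # Hypothetical shape: the set of capture IDs the provider currently offers.
    capture_ids: set = field(default_factory=set)


def check_configure(advert, configure):
    """Return (accepted, reason) for a configure against the current advert.

    `configure` is assumed (for illustration only) to be a dict carrying a
    "requested" list of capture IDs.
    """
    requested = configure.get("requested") if isinstance(configure, dict) else None
    if not isinstance(requested, list):
        # Reject condition 1: the message is malformed.
        return False, "malformed"
    unknown = [c for c in requested if c not in advert.capture_ids]
    if unknown:
        # Reject condition 2: something was requested that wasn't advertised.
        return False, "not advertised"
    return True, "ok"
```

Per the conclusion, a party that generates a reject would then be expected to
send a new advertisement; that step is left out of the sketch.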

New Issues:
===========

a) Need to decide whether advertisement is complete information "all" or
   just a "delta".  [Note: current framework is "all"]
b) Site Switching: there is an issue when you have multiple captures and do
   site switching. It needs to be consistent.  May need real-time updates
   for spatial information, time synchronization, etc. - e.g., RTCP, XCON
   notifications. (Action vi)
c) What is the advantage of not having a 3rd O/A exchange in the session
   setup?
d) Must the configure message be consistent with the current SDP? Needs
   further consideration, e.g., the order in which things are changed.
   Overarching objective is to minimize what info might be duplicated.
e) Need a mechanism to know which m-lines are under CLUE control within
   SDP.  Also, need to define how this works within O/A.
f) Can CLUE and BFCP control the same m-line?  How do CLUE and BFCP
   interact? What resources do we want to control with a floor in CLUE? To
   the level of capture (or scene)?  What are the semantics?  Who drives?
   Configure or Advertisement? A use case would be helpful.  (Action xi)
g) Define a (minimal) state machine.  (Note this relates to the CLUE
   instance and is coupled with the protocol state machine.)

Actions:
========
i) Chairs: add new issues to tracker.
ii) Andy: Propose text/definition for "capture encoding" term (Conclusion
    #1). Work with Mark to get framework updated.
iii) Chairs: close ticket #16.
iv) Christian: Based on concerns raised in
    draft-groves-clue-scene-clarifications, point out places in the document
    that are unclear or propose additional text in specific sections to
    clarify his concerns.
v) Andy:  Need to define the mechanism to indicate limitations on simulcast
   in the advertisement. (Conclusion #2)
vi) Jonathan:  Put out a proposal for discussion of Site Switching (Issue b)
vii) Andy: Add text to Framework for rejecting Configure (Conclusion #3)
viii) Roni: Feature tag to signal CLUE support - need to fwd our
      requirements to the SCTP/UDP work to define a CLUE channel
      (Conclusion #4)
ix) Keith:  Propose text as to how handling of out-of-date advertisements
    will work. Describe how this optimizes glare as opposed to current
    model. (Conclusion #8)
x) Christer: propose a mechanism – e.g., grouping, labels or whatever – to
   know which m-lines are under CLUE control within SDP (Issue e).  Also,
   need to define how this works within O/A.
xi) Keith: Propose a use case with CLUE and BFCP controlling same m-line.
    (Issue f)

Way Forward:
============
Documents for the following (revised or new):
- Updates to framework for “capture encoding” terminology. (Andy)
- draft-romanow-clue-call-flows – develop signaling model further. (Rob,
  Simon/Roberta)
- Telemedical call flow doc – updated to include basic case and more detail
  added. (Paul)
- draft-even-clue-rtp-mapping – updates based on discussions including
  “capture encoding” terminology. (Roni, Jonathan)
- Working document for CLUE instance and State Machine. (Simon, Roberta)
- Signaling proposal based on a strawman for schema. (Simon, Roberta)

==============
Detailed notes
==============

Mary's notes (Sept. 19):
========================

Data model
----------
- consumer instance would be different than provider instance
- may advertise "all" (complete) information or only "delta"
- endpoint needs to keep track of config
- Stream instantiation is based on advertise, config, SDP info, etc.
- Consumer role is independent of Provider role and not coupled like SDP O/A
- (Roni) only info that is CLUE specific is spatial information
- CONFIGURE defines upper bound, but SDP limit may override
- Cullen: there will be problems if you have different limits

Ticket #16:
-----------
- Conclusion: agree the term "capture encoding"
Action: Andy. Propose text/definition on mailing list. Work with Mark to get
framework updated.  Chairs, close ticket #16.

Going off topic:
----------------
- Issue: Can the FW explicitly restrict the use of simulcast?
- Jonathan: Do we need a model (richer language) for advertisements to define
what is valid

CLUE scene clarifications (draft-groves-clue-scene-clarifications):
----------------------------------------------------------------
- discussing optimal versus alternative experiences
- issue with how to select if screen limited
-- Are alternatives only in capture scene?
-- composite versus separate scenes

- Site switching (and a reference to the draft in Stockholm).  There is an
issue when you have multiple capture scenes and you do site switching. You
don't want to send everything.  May need to do real-time updates in RTCP (or
XCON pkg).  Objective is consistency (e.g., someone's arm is continuous across
frames when it moves beyond its own area of capture).

Reviewed each of Christian's suggestions.  a), b) and d) are already part of
the framework.  The concept of a "capture group" c) was not agreed.  As for d)
it is not a MUST (it is a should).  Action for Christian to point out places in
the document that are unclear or propose additional text in specific sections
to clarify his concerns.

RTP
----
- Don't want every capture in SDP.  Provider needs to know what consumer can
receive.  Don't have a way for the provider to know consumer capabilities. 
Roni discussed the various drafts.

With regard to mapping option documents, Christer noted that
draft-westerlund-avtcore-max-ssrc will be submitted to MMUSIC.

Roni clarified that these documents are there so they need to be considered
(not necessarily endorsing).

On the SSRC document (draft-westerlund-avtcore-max-ssrc), Jonathan L asked
whether we negotiate the use of SSRC multiplex.  Do you already need to say
what m-line you need?  Answers: Yes.

Paul W: should be able to do CLUE w/o multiplex - just need multiple m-lines
for each encoding group.

Can use labels ala BFCP.

[Note:  Andy's notes are quite thorough here, so I'm not going to type mine out]

Call Flows (Rob)
----------------

Must advertisement messages be within what the recipient has negotiated in SDP?
Must configure messages be within what the sender has negotiated in SDP?
- No.
If yes, what to do about middle boxes changing SDP parameters?
If no, what happens when someone asks for more than can be sent to them?
- Sender always needs to know BW.  Receiver does as well due to configure.

What happens when a new advertisement message is sent mid-call?
Can this invalidate the previous configure?
Must it invalidate the previous configure?
Should configure messages referring to previous advertisements be rejected?
- Yes - there should be a way to reject a configure message.

Telemedical Use Case (Paul)
---------------------------
- configure tied to SDP (e.g., m-line)

- An O/A should be consistent with the most recent advertisement in each
direction.  New O/A can come before the configure

Christer: must send Advertisement before O/A

Roni: advertise is peslgptng.  Config is different.

Should the O/A have info/linkage with the advertisement?

Do we need a way to reject a config?  Want an explicit rejection

Slide 5 (revised):  Do you have to send a config?  Before config s/b 0 captures.
- Must interwork with legacy.
- Need to consider pre-conditions

Conclusion: need a way to signal CLUE and not assume based on m-lines.

What is advantage of not having a 3rd O/A exchange in the session setup?

Must the configuration be consistent with the current SDP?

Signaling (Rob)
---------------
- Consensus:  Agree to use SCTP/UDP/DTLS as the transport. Noted that we could
  share the channel with RTCWEB.
- Action: take to list for confirmation (and close Ticket #14)
- Action: include motivation for this approach in the documents

What CLUE-specific information should be in an initial SDP?

What additional CLUE-specific information should be in subsequent SDPs?

Must there always be a second round of SDP offer/answer?
Conclusion: No.

Andy's notes (Sept. 19th):
===========================
Day 1 (Wednesday)

Topic: Data model.

Objective is to try to agree the data necessary to be exchanged; CLUE instance
model vs messaging model.

Christer: need to understand what is meant by CLUE instance - shouldn't think
of what we should transport where, but what data is needed.

Jonathan: provider needs to know what it has sent to far end

General agreement that we shouldn't define in too much detail actual
implementations

Roni: do we assume that advertisement is valid until updated by provider? 2
parts to the state - what's advertised and what's chosen. Is a 2nd
advertisement a complete replacement, or does it add / remove sections? What
does a new advertisement mean - and if it's a delta model then we need to be
able to *remove* as well as add.

PaulK: whether full state or difference is up for discussion

General agreement that this isn't something we need to decide on right now -
we can come up with a "deltas" method for whatever we decide needs to be in
the data model

Topic: Ticket #16

Mary: We had an e-mail thread about what to call "capture encoding"s - last
contribution was from Espen, and there's some discussion about perhaps using
"srcname" for this.

Bo: srcname is hierarchical, so all captures generated from the same device
(e.g. a camera) would have most of their srcname in common - e.g. "..///".

Jonathan: is it a problem with the model that simulcast is effectively always
enabled? ("3 out of 9" example - is it possible right now, as the framework
document stands, for a provider to say that it can provide any 3 captures out
of 9 available but no single capture more than once?) This is believed not to
be possible right now - any encoding group allowing multiple encodings
effectively signals that simulcast is possible for any captures associated
with that encoding group.

Andy / PaulW: perhaps we should add a restriction on the number of simulcasts
of each capture that's possible
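
The "3 out of 9" example can be made concrete with a small sketch. This is a
simplification, not the framework's actual data model: an encoding group is
reduced to a maximum number of simultaneous encodings, and max_per_capture is
a hypothetical per-capture limit of the kind suggested by Andy / PaulW.

```python
from collections import Counter


def is_valid_selection(selection, max_encodings, max_per_capture=None):
    """Check a list of requested capture IDs against one encoding group.

    selection: one entry per requested encoding (a repeated ID is simulcast).
    max_encodings: total encodings the group can produce at once.
    max_per_capture: hypothetical extra limit on encodings per capture;
        None models the framework as discussed (no such limit exists).
    """
    if len(selection) > max_encodings:
        return False
    if max_per_capture is None:
        # Without the extra limit, any repeat of a capture is allowed.
        return True
    return all(n <= max_per_capture for n in Counter(selection).values())
```

With only a group-wide limit of 3, a request for the same capture three times
passes the check, which is the problem Jonathan describes. Adding
max_per_capture=1 rejects that request while still allowing any 3 distinct
captures.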

General agreement that "capture encoding" is the best term we've come up with
for ticket #16

Andy action item: Need to make sure "capture encoding" is defined and agreed
on mailing list and then added to framework - stated intent to consult with
Mark Duckworth on this

Roni: think we lack the ability to express whether different capture scenes can
be encoded simultaneously or not.

Andy: this should be possible today, given that encoding groups are outside of
capture scenes. Need to confirm that this is the current stated definition /
structure of framework / data model.

PaulK: if you receive an advertisement with multiple scenes, you should get an
entry from each scene, e.g. a video and audio capture scene entry for each
capture scene, where there might be one capture scene for "main" media and one
for "presentation"

The issue was raised that perhaps we need the concept of capture scene
alternatives, for instance a capture scene being able to be defined as an
alternative to one or more other capture scenes - this would allow, say, a
provider to advertise a pre-composed combined main / presentation video stream.
At this point, introducing prioritisation of capture scenes was also proposed.

Topic: "Discussion of scene and capture scene entry concepts
draft-groves-clue-scene-clarifications-00"

This discussion focused on points "a)" - "d)" in the "2. Proposals" section of
the document.

a) and b) already a given - we believe the framework already says this (modulo
"capture devices" / "spatially related" tweaks)

c) "capture group" - not seen as a solution to this issue - no general support
for adding the concept of capture groups

Stephen: do need to solve the problem with real-time co-ordinate updates

d) no changes should result from this - perhaps a recommendation that
automatically chosen captures should always be complete capture scene entries.
Christian to clarify the meaning of this, or which pieces of text are wrong

My take away from this session was that we really need clarification from
Christian exactly whether points a) - d) were all intended to describe
modifications to the existing (framework) draft, and if so what deficiencies
they were intending to address.

Afternoon session

RTP - Roni and Jonathan

Introduction slide - trying to avoid duplication between SDP and CLUE.

Assumption - CLUE systems support different topologies: Point to point, Media
mixers, Media switching mixers, Source projection mixers

Two SSRC behaviors: static SSRCs (assigned by MCU mixer), dynamic SSRCs
(original sources' SSRCs relayed to participants, e.g. via CSRCs)

Most of the work so far has assumed session multiplexing rather than SSRC
multiplexing. Several relevant drafts for this sort of thing:
- source attribute (RFC 5576) to describe attributes of RTP sources based on
  their SSRCs
- RFC 6236 for generic image attributes
- draft-westerlund-avtcore-max-ssrc for multiple SSRCs within an RTP session
- draft-westerlund-avtcore-rtp-simulcast

Cullen: number of SSRCs is different to the number of streams - so not
necessarily as relevant for CLUE.

Cullen: seems to be a general need for more linkage between SDP and application
layers such as XCON, CLUE, webrtc - maybe we want to solve this in a general
way

PaulK: support for SSRC multiplexing so far has been geared towards decision
being made by sender (at the static vs dynamic level, anyway). Do we need to
be able to specify an m-line to use when requesting a capture encoding?
General feeling is yes - might also need provider to signal which m-lines are
valid for each capture (or perhaps this could be done at the encoding group
level?)

Mary's notes (Sept. 20th):
===========================

Issue summary from Wednesday
--------------
- Reviewed and updated the list of issues from Wednesday (in the chairs charts)
1.  Need to decide whether advertisement is complete information "all" or just
a "delta"  [Note: current framework is "all"]

2. Can the FW explicitly restrict the use of Simulcast?

3. Site Switching: there is an issue when you have multiple captures and do
site switching. It needs to be consistent.  May need real-time updates (e.g.,
RTCP, XCON notifications).

4. Call Flow: Do we need a way to reject a Configure message?

5. Call Flow: How do we signal support for Clue?

6. Call Flow: What is advantage of not having a 3rd O/A exchange in the session
setup?

7. Call Flow: Must the configuration be consistent with the current SDP?

8. Ticket #12:  What CLUE information is carried in SDP?
-  What CLUE-specific information should be in an initial SDP?
-  What additional CLUE-specific information should be in subsequent SDPs?

9.  Ticket #13:  What CLUE information is carried in CLUE specific signaling?

10. Ticket #15:  What signaling protocol should be used for the CLUE
information?
 General agreement to use XML schema for the information.

11. What information is required to be supported (i.e., must specify whether
information is optional or mandatory)?

Telemedical Call Flow - simple version (Paul)
------------------------------------

- 1st Invite is before CLUE
- Christer: concerned about 491s
- Paul: could introduce a mechanism if we think there will be a glare problem -
  can see CLUE glare on configure.
- Rob: depends upon how much the SDP will change based on the advertisement as
  to whether you need a re-Invite. Must re-invite if advertisement impacts SDP.
- Is there an order in which the messages occur or must the CLUE entities be
  able to handle them in any order?
- Andy: basic audio and basic video is optional. If you do that need to define
  when that ceases.
- Andy: do both advertisements need to occur before re-Invite?
- Discussion: NO. They are not synchronized
- Roni: we're assuming a system in which you have a SIP call for multiple
  cameras, etc. (versus multiple SIP calls)

More complex flow - MCU case (Paul K)
-------------------------
- Discussion of empty configure and empty advertisement.
- Christer and Roni: No.

Issues:
1) Is there an expectation on establishing a CLUE session that you will have an
   advertisement?
2) Is it possible to send an empty advertisement (or should you)? If so what
   are the semantics and when is it allowed/valid to do so?

Christer: need semantic for advertisement
Andy: Agree. Maybe we need to consider a media sync.  Don't need synchronous
advertisements
Rob: empty advertisement means something because you could have been doing
something, thus empty advert stops that
Rob: you can still send media w/o CLUE messaging
Roni: concerned about state mis-match
Paul K: Depends upon what state you are in.  Startup or default?
Roni: is this mode only if there wasn't a previous advertisement?
Rob: empty advert is "I'm not doing CLUE in this direction".
Roni: if you do that, can only do before a configure
Jonathan: this is like BFCP - if it has token and is controlling m-line.  Need
to know which m-lines are under CLUE control.
Paul K: Suggest an advert that is default (not empty). Want a non-configure
state for before 1st advert

Issue: Do you send media before 1st advertisement when you know it's CLUE?

Issue:  Can CLUE and BFCP control the same object/m-line?  Note: need to
associate a floor with a capture.

Review of issues (as summarized in chair charts):
-------------------------------------------------
- Note that conclusions and actions for the issues are captured in the chair
charts.   A couple of new issues did arise during this review that need to be
added to the list.

- Issue: What happens (i.e., what's the behavior) if a CLUE channel goes down?
  What does media do?  Should you send an advertisement or configure when the
  CLUE channel comes back up?
  Action: We need to feed our Reqs into the SCTP/UDP work in order to define a
  CLUE channel.

Rob's notes (Sept. 20th):
===========================

9am start, 2012-09-20

Mary presented some initial issues for discussion that came up yesterday:

* Need to decide if information is complete or a delta - framework currently
assumes all

* The framework defaults to simulcast being available unless specified
otherwise. It's not clear if it's possible to communicate all possible
restrictions a sender may have given the current encoding framework.

* Site switching: In the switched case we may want to send real-time updates of
the originating spatial information from the originating source.

* Do we need a way to reject a configure message (and what does it mean if we
do so)

* How do we signal support for CLUE

* What is the advantage of not having a third O/A exchange at the start of the
call

* Must the configuration be consistent with the most recent SDP

* Ticket 12: What CLUE information is carried in SDP

* Ticket 13: What CLUE information is carried in the CLUE-specific signalling

* Ticket 15: What signalling protocol should be used for the CLUE information
(general agreement on XML messages in SCTP over UDP)

* What information is mandatory and what is optional

9:20

Paul had updated his call-flow, and stripped it down to now be a point-to-point
call. He now went through this.

There was concern over the likelihood of glare - alternatives such as mandating
an INVITE from one side seemed less good, though.

Andy pointed out that this was the first time we'd started talking heavily
about new O/As to add m-lines; we had always stated that these were needed but
this is the first time we'd taken to discussing it. There was agreement that
while in most cases implementations will probably use one m-line per media type
we need to provision for the more complicated case.

Paul then went on to the draft itself, which contained the more complicated
version of the call-flow.

Christer wondered why the MCU was sending an initial empty advertisement rather
than holding off until a second party called in.

Issue: On establishing a CLUE channel, should there be an expectation that both
sides immediately send an advertisement message?

Issue: Is it possible to construct an 'empty' advertisement that advertises that
nothing is available?

There was debate about what the state is when the call starts but before an
advertisement/configure is done. Paul K suggested that the default advertisement
could match the RTP state (e.g., it would be if it wasn't there). The
alternative was that the default state should be that nothing can be sent until
the far end has sent a configure in response to an advertisement.

Rob also asked if m-lines should be explicitly bound (or unbound) to CLUE.
There was general agreement that this was probably the case.

Paul K asked if it would be necessary to associate an m-line with both CLUE and
BFCP. Jonathan pointed out that we would presumably need to associate captures
(or something else) in CLUE with the floors.

Issue: What is the 'default' state of the media before an
advertisement/configure is sent?

Issue: Do m-lines need to be explicitly bound or not bound to CLUE control?

Issue: How will CLUE and BFCP interact?

Break

Restart 10:47

Andy volunteered to take a look at whether limitations were needed for
simulcast and propose a change.

There was discussion of forwarding the original spatial information in the
switched conferencing case. While RTCP SDES was seen as a potential place, it
was not seen as a particularly good one. The roster list was seen as a better
candidate, though Andy pointed out that it could involve a lot of information.
Jonathan was volunteered to take charge of this issue and see if he could make
a proposal.

There was discussion of whether it was necessary to reject a configure message.
There was debate over whether a configure should refer specifically to a
previous advertisement and be rejected if it is not the most recent, or whether
advertisements should be constructed so that changes would be unambiguous when
a configure was received (and configurations that couldn't be fully met would
only send the portions that *could* be sent). There was also debate over
whether a malformed configure should be rejected; Keith was concerned that
adding rejection messages made CLUE into a negotiation protocol, which is SDP's
job. Finally, there was debate about potentially unsynchronised state (for
instance, because the transport dropped and was reestablished). Most people
felt we would need an explicit ACK/NACK, though this was not unanimous.

There was discussion of feature tags. People generally felt that feature tags
would be useful (and would signal that CLUE is *possible*, not that it's being
done), while option tags were not appropriate. This meant we wouldn't have
'require' functionality, but no one felt very strongly about that. People also
felt it was important that implementations should be able to tell that CLUE was
being used from the initial offer/answer (e.g., without it being negotiated in
SCTP or something).

There was a significant disagreement over the need for consistency at all times
between configuration and current SDP - some people felt that configurations
should always be consistent with the current SDP, while others felt that this
requirement was unnecessary.

When talking about what information required by CLUE should be in SDP, Rob
wondered if SDP was the best place to relate CLUE encoding ID to RTP SSRC, as
it adds CLUE-specific information to SDP and couples the two together.

There was general agreement that, while there are many decisions to be made,
an XML schema would serve as a strawman starting point.

Break

Restart 12:50

People explored whether at the start of a CLUE establishment a sender must send
an advertisement, and what the default state was before an advertisement was
sent. There was debate about whether implementations should be able to honour
out-of-date advertisements. There was discussion of whether it was better to
have a sequence number per advertisement (and each advertisement invalidates
the last - configurations referring to this are rejected) or whether capture
ids should be unique (so a configuration referring to a previous advertisement
may still be valid). Keith was tapped as proposing text for the latter to show
its advantages.
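
The two options debated above can be contrasted in a short sketch. Everything
in it is an assumption made for illustration: the function names, the integer
sequence numbers, and the rule that a capture id is never reused are not taken
from any CLUE draft.

```python
def accept_by_sequence(current_seq, configure_seq):
    """Option 1: each advertisement invalidates the last.

    A configure is accepted only if it refers to the newest advertisement.
    """
    return configure_seq == current_seq


def accept_by_capture_ids(current_capture_ids, requested_ids):
    """Option 2: capture ids are unique and never reused.

    A configure built from an older advertisement is still accepted as long
    as every capture it names is still on offer.
    """
    return set(requested_ids) <= set(current_capture_ids)
```

Suppose advertisement 1 offered VC0 and VC1, and advertisement 2 only adds VC2.
A configure for VC0 built against advertisement 1 is rejected under option 1
but accepted under option 2, which is the case the conclusions single out as
worth honouring. If advertisement 2 had instead withdrawn VC1, a configure
naming VC1 would fail under both options.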

Christer agreed to look at a mechanism for specifying in SDP which m-lines are
under CLUE control.

There was discussion of what the default state is when the CLUE O/A is first
established. While the 'empty', no captures available state was the simplest,
it was pointed out that in escalating from CLUEless to CLUEful this would cause
a glitch that isn't strictly necessary. No one was sure how important a problem
this was to work around.

It was acknowledged that more investigation is required for how BFCP and CLUE
will interact.

Finally, action items were assigned - see the issue tracker.