Protocol for Controlling Multiple Streams for Telepresence (CLUE)
draft-ietf-clue-protocol-19

Note: This ballot was opened for revision 17 and is now closed.

Alvaro Retana No Objection

Benjamin Kaduk (was Discuss) No Objection

Comment (2019-12-09)
I'm balloting No Objection, as the concerns noted in my previous Discuss ballot
position are not necessarily blocking issues for an Experimental document, and
in fact further experience might be helpful to reveal the appropriate values and
behaviors to use for timeouts and loop-avoidance.

My previous ballot position is preserved below; I note that some of the "substantial
comments that do not rise to Discuss level" seem to still be present in the -19.

DISCUSS

Thanks for the generally clear and well-written document!
I would like to discuss whether there needs to be more prominent coverage
of timers/timeouts, especially as relating to the state machines.  (I'd be happy
to learn that this is well-covered elsewhere in the document set; I just haven't
run into it yet.)

In a similar vein, do we want to have any treatment of avoiding infinite loops
(e.g., when a 'configure' or 'advertisement' is rejected in expectation of modification
but the sending implementation continues to generate an identical message)?

It is not clear to me that any change to the document text is needed in either case,
but I don't know to what extent the topics have already been discussed.

COMMENT

I also have some substantial comments that do not rise to Discuss-level.

How do I know which endpoint is the channel initiator and which is the
channel receiver?
draft-ietf-clue-signaling suggests that the DTLS client is the channel
initiator, but even that is not explicit about it -- the protocol could be
considered under-specified if there is insufficient clarity.

Section 2

The MCU definition doesn't actually expand the acronym, which seems a
little reader-unfriendly.

Section 5.1

There are perhaps more XML extension points in here than is reasonable for
some of these elements (e.g., <versionsListType>).

Section 5.2

                           If the responseCode is between 200 and 299
   inclusive, the response MUST also include <mediaProvider>,
   <mediaConsumer>, <version> and <commonExtensions> elements;

Maybe re-mention that MP and MC are booleans here.

   Finally, the commonly supported extensions are copied in the
   <commonExtensions> field.

Does this need to say that only extensions that are applicable to the
negotiated protocol version are included?  (Also, how does one handle an
extension that exists for multiple major versions -- are there two
<extension> elments for it in the <options> message?)

   Upon reception of the 'optionsResponse' the version to be used is the
   one provided in the <version> tag of the message.  The following CLUE
   messages MUST use such a version number in the "v" attribute.  The
   allowed extensions in the CLUE dialogue are those indicated in the
   <commonExtensions> of the 'optionsResponse' message.

Couldn't this restriction on the "v" value apply even to the
'optionsResponse' message?

Section 5.3

   The 'advertisement' message is used by the MP to advertise the
   available media captures and related information to the MC.  [...]

I'd consider avoiding the definite article "the" to refer to MP/MC roles,
since in many caess there will be 2+ of each, and we don't want to confuse
the reader into thinking that there is an MP/CR equivalence or something
like that.  So, perhaps "each MP" and "the corresponding MC".

Section 5.4

                             As it can be seen from the message schema
   provided in the following excerpt (Figure 6), the 'ack' contains a
   response code and a reason string for describing the processing
   result of the 'advertisement'.  [...]

[the reason string is part of the base clueResponseType]
The text quoted here could be read as implying that the reason string is
required in the 'ack' message, a stronger requirement than of the base
clueResponseType where it has minOccurs=0.  Some greater clarity in the
text here is probably called for, especially since when the 'ack' is
piggybacked on a 'configure' message, there is no provision for a reason
string at all. 

Section 5.5

                             The <ack> element MUST NOT be present if an
   'ack' message has been already sent back to the MP.

I think you need to clarify that this is scoped to the current
advSequenceNr.

Section 5.6

                                             It contains (Figure 8) a
   response code with a reason string indicating either the success or 
   the failure (along with failure details) of a 'configure' request
   processing.  [...]

[Same comment about reason string as for 'ack']

Section 5.7
   
   Such new response codes MUST NOT overwrite the ones here defined and
   they MUST respect the semantics of the first code digit.

nit: is this "overwrite" or "override"?
   
   This document does not define response codes starting with "1", and
   such response codes are not allowed to appear in major version 1 of
   the CLUE protocol.  The range from 100 to 199 inclusive is reserved
   for future major versions of the protocol to define response codes
   for delayed or incomplete operations if necessary.  Response codes
   starting with "5" through "9" are reserved for future major versions
   of the protocol to define new classes of response, and are not 
   allowed in major version 1 of the CLUE protocol.  Response codes 
   starting with "0" are not allowed.
   
This text seems to also preclude extensions to major version '1' from
defining 1xx or [5-9]xx reason codes; is that the intent?

Section 6

   When the CLUE data channel set up starts ("start channel"), the CP
   moves from the IDLE state to the CHANNEL SETUP state.

nit: only one of "sets up" and "starts" is needed.

   When in the ACTIVE state, the CP starts the envisioned sub-state
   machines (i.e., the MP state machine and the MC state machine)
   according to the roles it plays in the telepresence sessions.  Such
   roles have been previously declared in the 'options' and
   'optionsResponse' messages involved in the initiation phase (see
   OPTIONS sections Section 5.1 and Section 5.2 for the details).  [...]
   
My reading of the initiation phase is that each CP sends only a boolean
indication of whether it supports the MP/MC roles, and so each party has to
determine on its own whether it will act as a MP and/or MC; is that
correct?  If so, do we need to say anything about how the boolean matrix
translates to activating the respective sub-state machines?

Section 6.1

                         'configure+ack' messages referring to out-of-
   date (i.e., having a sequence number equal to or less than the
   highest seen so far) advertisements MUST be ignored, i.e., they do
   not trigger any state transition.  [...]

Is this really less than or equal or just less than?  Also, is "seen" the
right verb, since IIUC these are sequence numbers that the MP has
*generated* in its advertisements?

Section 7

                                                          In other
   words, in this example, the MP MUST use version 1.4 and downgrade to
   the lower version.  [...]

nit: does the phrase "and downgrade to the lower version" add any value
here?  The word "downgrade" can have negative connotations in some other
contexts, so if it's not adding value I'd suggest avoiding it.

Section 8

   As reported in Figure 13, the values of the fields of the <extension>
   element (either new information or new messages) take the following
   values:
[...]
   o  the major standard version of the protocol that the extension
      refers to.

The XSL includes a full version (including minor), even though the
semantics basically only use the major version.  That said, why is the
'version' element minOccurs="0" -- what are the semantics when it is
absent?

Section 8.1

   The CLUE data model document ([I-D.ietf-clue-data-model-schema])
   envisions the possibility of adding this kind of "extra" information
   in the description of a video capture by keeping the compatibility
   with the CLUE data model schema.  [...]

nit: I don't think this is grammatical; maybe just "keeping compatibility".

Section 10

This claims to be a "call flow" example, but the described flows only
contain a single unidirectional media flow, which is not really consistent
with the normal usage of the word "call".  Buried in the body text there is
a disclaimer:
   [...]                      For the sake of simplicity, the following
   call flow focuses only on the dialogue between MP CP1 and MC CP2.
I would suggest making the presence of this simplification much clearer
from the start, perhaps "CLUE protocol messages exchanged in the following
call flow are detailed; only one direction of media is shown for
simplicity, as the other direction is precisely symmetric".

   CP2 acknowledges the second 'advertisement' message with an 'ack'
   message (Section 10.7).

   In a second moment, CP2 changes the requested media streams from CP1
   by sending to CP1 a 'configure' message replacing the previously
   selected video streams with the new composed media streams advertised
   by CP1 (Section 10.8).

This might be an appropriate place to indicate that media from the previous
configuration continue to flow during the reconfiguration process.

It might also be worth noting again somewhere in here (or a subsection)
that there are three (well, two, since we only show one direction of media)
distinct sequence number spaces per sender, and that the discontinuity
between Section 10.2 and 10.3's numbers is correct.

Section 11

Thanks for the well-thought-through security considerations; in combination
with the linked documents (particularly the framework), they cover all the
considerations (especially privacy considerations) that I had in mind.

Section 14.2

We appear to be citing both 5117 and 7667, whereas the latter obsoletes the
former.

Martin Vigoureux No Objection

Warren Kumari No Objection

Comment (2018-11-19 for -17)
I'd like to thank Benjamin for his detailed review! Also, Zitao Wang for the OpsDir review - it contains some useful nits, and I'd encourage the authors to address them.

(Ben Campbell; former steering group member) Discuss

Discuss [Treat as non-blocking comment] (2018-11-19 for -17)
Thanks for all the work on this over the years. I have a few concerns that I think require discussion prior to publication:

§5.7: Is the "description" field expected to be human readable? If so, are there internationalization issues to consider?

§6.2, 5th paragraph: This says that if you get an error back for a configure message, you send a new configure message. This seems likely to cause an infinite loop unless some guidance is given about escaping the loop when the endpoints cannot agree on a configuration.

§7:  I’m confused by the versioning mechanisms. This section requires an endpoint to ignore unknown elements, but it also requires the peer to downgrade to the highest shared version. These requirements seem to be at cross purposes. If the peer downgrades, one should only see unknown elements in the case of implementation errors. The requirement to ignore unknown elements does not come for free; nor does the requirement to downgrade.

§5.1 and §8: The use of the options message to negotiate extensions seems underspecified. How does an endpoint compare extensionType elements?  Is a spec required or expected? Is the extension spec expected to register the URI for schemaRef somewhere? Does this need to be in IANA?

(Adam Roach; former steering group member) Yes

Yes ( for -17)
No email
send info

(Alexey Melnikov; former steering group member) No Objection

No Objection (2018-11-20 for -17)
I support Ben's DISCUSS.

A couple of extra nits, which I think need addressing:

XML and XML Schema need to be a Normative References.

12.4.1.  CLUE Message Types

   +-------------------+-----------------------------------+-----------+
   | Message           | Description                       | Reference |
   +-------------------+-----------------------------------+-----------+
   | options           | Sent by the CI to the CR  in the  | RFCXXXX   |
   |                   | initiation phase to specify the   |           |
   |                   | roles played by the CI, the       |           |
   |                   | supported versions and the        |           |
   |                   | supported extensions.             |           |
   | optionsResponse   | Sent by the CI to the CR in reply | RFCXXXX   |

Should the above be: "Sent by the CR to the CI ..."

   |                   | to an 'options' message to        |           |
   |                   | finally estabilsh the version and |           |

Typo: establish

   |                   | the extensions to be used in the  |           |
   |                   | following CLUE messages exchange. |           |

Examples in Section 10 contain URI "http://wpage.unina.it/spromano/clue-protocol-17-schema-file.xsd". Maybe they shouldn't.

(Alissa Cooper; former steering group member) No Objection

No Objection (2018-11-20 for -17)
Please address the Gen-ART review comments.

Section 5.7: I don't understand why the 100-199 range is reserved for a specific purpose for a future major version rather than having codes in that range defined now. That is, if it is discovered that delay or incompleteness is useful to signal, why would establishing support for that necessarily require some other backwards-incompatible change to the protocol?

Section 8: In addition to Benjamin's question about the version I don't understand how the schemaRef having minOccurs=0 makes sense. Doesn't it need to be included?

(Deborah Brungard; former steering group member) No Objection

No Objection ( for -17)
No email
send info

(Eric Rescorla; former steering group member) No Objection

No Objection (2018-11-19 for -17)
Rich version of this review at:
https://mozphab-ietf.devsvcdev.mozaws.net/D3717


I have noted a number of points where I think this is not fully
specified. I am not marking this DISCUSS because I believe they are
easy to fix and assume the AD will take care of them.


IMPORTANT
S 5.1.
>      detailed in Section 5.7
>   
>   5.1.  options
>   
>      The 'options' message is sent by the CLUE Participant which is the
>      Channel Initiator to the CLUE Participant which is the Channel

How do I know who is the CI and CR


S 6.
>      moves from the IDLE state to the CHANNEL SETUP state.
>   
>      If the CLUE data channel is successfully set up ("channel
>      established"), the CP moves from the CHANNEL SETUP state to the
>      OPTIONS state.  Otherwise if "channel error", it moves back to the
>      IDLE state.  The same transition happens if the CLUE-enabled

Is it possible to re-start the setup phase? If not, perhaps you want
an ERROR state. If so, maybe describe how


S 6.1.
>      MP moves to the WAIT FOR CONF state.  If a NACK arrives (i.e., an
>      'ack' message with an error response code), the MP moves back to the
>      ADV state for preparing a new 'advertisement'.  When in the WAIT FOR
>      ACK state, if a 'configure' message with the <ack> element set to
>      TRUE arrives ("configure+ack received"), the MP goes directly to the
>      CONF RESPONSE state.  'configure+ack' messages referring to out-of-

What about a configure without an ACK?


S 6.1.
>      date (i.e., having a sequence number equal to or less than the
>      highest seen so far) advertisements MUST be ignored, i.e., they do
>      not trigger any state transition.  If the telepresence settings of
>      the MP change while in the WAIT FOR ACK state ("changed telepresence
>      settings"), the MP switches from the WAIT FOR ACK state to the ADV
>      state to create a new 'advertisement'.

What happens if I receive configure while in ADV?


S 7.
>      the lower version.  This said, and coherently with the general IETF
>      "protocol robustness principle" stating that "an implementation must
>      be conservative in its sending behavior, and liberal in its receiving
>      behavior" [RFC1122], CLUE Participants MUST ignore unknown elements
>      or attributes that are not envisioned in the negotiated protocol
>      version and related extensions.

This seems to mean that if you speak X.Y, then you need to minimally
have a map of all features in [0..Y-1]. Is that correct?

COMMENTS
S 5.1.
>     <xs:anyAttribute namespace="##other" processContents="lax"/>
>   </xs:complexType>
>                  Figure 3: Structure of CLUE 'options' message
>   
>      <supportedVersions> contains the list of the versions that are
>      supported by the CI, each one represented in a child <version>

Are these ordered?


S 5.1.
>      misinterpreting the contents of messages.  The minor version is
>      obviously less useful in this context, since minor versions are
>      defined to be both backwards and forwards compatible, but it is more
>      useful to know the highest minor version supported than some random
>      minor version, as it indicates the full feature set that is
>      supported.  The reason why it is less useful is that the value can in

I'm not quite following the juxtaposition of "full feature set" and
"backwards and forwards compatible". Can you spell out the guarantees
more precisely


S 5.2.
>      inclusive, the response MUST also include <mediaProvider>,
>      <mediaConsumer>, <version> and <commonExtensions> elements; it MAY
>      include them for any other response code.  <mediaProvider> and
>      <mediaConsumer> elements are associated with the supported roles (in
>      terms of, respectively MP and MC), similarly to what the CI does in
>      the 'options' message.  The <version> field indicates the highest

What does it mean to provide these for other codes?


S 5.2.
>   
>              Figure 4: Structure of CLUE 'optionsResponse' message
>   
>      Upon reception of the 'optionsResponse' the version to be used is the
>      one provided in the <version> tag of the message.  The following CLUE
>      messages MUST use such a version number in the "v" attribute.  The

What goes in the "v" attribute for this message?


S 5.5.
>      </xs:complexType>
>   
>                 Figure 7: Structure of CLUE 'configure' message
>   
>      The <advSequenceNr> element contains the sequence number of the
>      'advertisement' message the 'configure' refers to.

It would be useful to mention here how out of date configures are
handled.


S 6.
>      herein discuss the state machines associated, respectively, with the
>      CLUE Participant (Figure 10), with the MC process (Figure 11) and
>      with the MP process (Figure 12).  Endpoints often wish to both send
>      and receive media, i.e., act as both MP and MC.  As such there will
>      often be two sets of messages flowing in opposite directions; the
>      state machines of these two flows do not interact with each other.

This point would have been useful to make earlier.


S 6.2.
>      successfully agreed on the media streams to be shared.  Then, the MC
>      can move to the ESTABLISHED state.  On the other hand, if an error
>      response is received ("error configureResponse received"), the MC
>      moves back to the CONF state to prepare a new 'configure' request.
>      If a new 'advertisement' is received in the WAIT FOR CONF RESPONSE
>      state, the MC switches to the ADV PROCESSING state.

I agree with what others have said here. It seems like only certain
errors merit this.


S 10.
>   </xs:schema>
>   
>                    Figure 16: Schema defining CLUE messages
>   
>   10.  Call flow example
>   

This would be vastly more useful earlier.

(Ignas Bagdonas; former steering group member) No Objection

No Objection ( for -17)
No email
send info

(Mirja Kühlewind; former steering group member) (was Discuss) No Objection

No Objection (2018-10-31 for -17)
Update: Revising my discuss, as I missed the pointer to draft-ietf-clue-signaling. Sorry for that!

Two minor editorial commets:

1) In the terminology section: What does MCU actually stand for?

2) sec 4:
"CLUE protocol version numbers are
   characterized by a major version number (single digit) and a minor
   version number (single digit)..."
However, later on the text says that the numbers can also consist of multiple digits...

(Spencer Dawkins; former steering group member) No Objection

No Objection ( for -17)
No email
send info

(Suresh Krishnan; former steering group member) No Objection

No Objection (2018-11-19 for -17)
I found the document to be well written and easy to read. I just had a concern about the response codes. I did not find text either in this on in draft-ietf-clue-signaling that describes the behavior that results in a specific response code. e.g. 402 and 404 seem to have some overlap but there is no text for when a 404 will be sent to disambiguate it from 402. Likewise for the other response codes. I did not find any prescriptive behavior that results in these codes.

(Terry Manderson; former steering group member) No Objection

No Objection ( for -17)
No email
send info