An Extensible Messaging and Presence Protocol (XMPP) Subprotocol for WebSocket
RFC 7395

Note: This ballot was opened for revision 07 and is now closed.

(Richard Barnes) Yes

Alissa Cooper Yes

(Jari Arkko) No Objection

Comment (2014-07-07 for -07)
Dan Romascanu raised two issues in his Gen-ART review. I have not seen a response yet, but these points should be at least considered before approving the document.

(Alia Atlas) No Objection

(Benoît Claise) No Objection

Comment (2014-07-10 for -07)
As noted by Jürgen in his OPS-DIR review:

This document defines how to run XMPP over WebSockets. The intended
status is standards track. I believe the document is in a good shape
and basically ready to go. In particular, I do not see that the XMPP
over WebSockets specification creates any operational issues.

Some editorial nits:

- Sec 1: The term 'raw socket' can potentially be misunderstood;
  perhaps simply remove 'over raw sockets' completely (I think the
  message of the sentence remains intact without these words).

- Sec 3.1: The text says that both client and server MUST have |xmpp|
  in the list of protocols for the |Sec-WebSocket-Protocol| header.
  The text does not detail what happens if this is not the case. Is
  there a defined behavior if this protocol negotiation fails?

- Sec 3.6.1: There is a closing parenthesis missing at the end of the
  first paragraph.

- Sec 3.9: Word missing in "it MUST be enabled the WebSocket layer",
  perhaps you meant "it MUST be enabled _by_ the WebSocket layer"?
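
The Sec 3.1 question above can be made concrete with a minimal sketch of the check a client might run after the handshake. The function names and the comma-splitting of the header value are illustrative assumptions, and what to do when the check fails (for example, closing the connection) is left open here, since that is exactly what the review asks the document to define.

```python
from typing import List, Optional

XMPP_SUBPROTOCOL = "xmpp"


def parse_protocol_header(value: Optional[str]) -> List[str]:
    """Split a Sec-WebSocket-Protocol header value into protocol tokens."""
    if not value:
        return []
    return [token.strip() for token in value.split(",") if token.strip()]


def xmpp_negotiated(client_offer: Optional[str], server_reply: Optional[str]) -> bool:
    """Return True only if the client offered xmpp and the server selected it."""
    offered = parse_protocol_header(client_offer)
    selected = parse_protocol_header(server_reply)
    # Assumption: the server reply carries a single selected subprotocol.
    return XMPP_SUBPROTOCOL in offered and selected == [XMPP_SUBPROTOCOL]
```

A caller would pass the header value it sent and the one it received, and treat a False result as a failed negotiation.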

(Spencer Dawkins) No Objection

Comment (2014-07-10 for -07)
I found the document to be quite easy to read. 

I support Ted's Discuss on security.

My apologies for the newbie question, but in this text:

3.5.  Stream Errors

   Stream level errors in XMPP are terminal.  

is "terminal" a term of XMPP art, or does it mean "fatal"?

(Adrian Farrel) No Objection

(Stephen Farrell) (was Discuss) No Objection

Comment (2014-09-11)
Thanks for handling my discuss, the result looks fine to me.

--- OLD comments below, I didn't check 'em.

- 3.1 you should explain the |thing| notation (or reference an
explanation)

- 3.6.1 - does the see-other-uri interact with the SOP if
running say a JS client in a browser? How? (Is such an
implementation a target?)

- 3.8 - looks to me that this makes those two XEPs normative
references. Just saying MAY does not mean that you don't have
to read them to implement, you do. I hope that's not a problem
for you but don't see that it should be since those are stable
enough references I guess.

- 3.9 - you say a "server" MUST NOT advertise TLS - that seems
a bit wrong - perhaps that'd be better as "a websocket
server's listener MUST NOT..." since I could have another
listener in the same process even that does native
xmpp/tls/tcp as well, right?

(Brian Haberman) No Objection

Barry Leiba No Objection

Comment (2014-07-08 for -07)
It would be useful to add a sentence at the end of the introduction that tells people where to find the XSF XEP documents, perhaps including a root URI.

-- Section 3.1 --
The mechanism of using vertical bars instead of quotation marks jarred me at first, and I expected to see "|xmpp|" in the protocol.  It didn't take long to figure it out, but, as the notation is different to what we usually write, it might be useful to note it in Section 2, and explain when you use vertical bars and when you use quotes (I think the difference is protocol elements vs. prose).

It also might be useful to have a sentence introducing the example, which says, "The following is an example of a WebSocket handshake followed by an initial XMPP protocol exchange," or some such.  (Or you might make the example a figure, and put that in as a figure caption.)

-- Section 3.3.3 --

Editorial:
   The inclusion of XML declarations, however,
   is NOT RECOMMENDED as WebSocket messages are already mandated to be
   UTF-8 encoded and therefore would only add a constant size overhead
   to each message.

The subject of "would only add" is dangling.  I suggest this fix:

NEW
   The inclusion of XML declarations, however,
   is NOT RECOMMENDED, as WebSocket messages are already mandated to be
   UTF-8 encoded.  Inclusion of declarations would only add a constant
   size overhead to each message.
END

-- Section 3.6.1 --
Two nits:
1. "e.g." needs a comma after it (two places).
2. The first paragraph needs a second closing parenthesis before the final period.

A non-nit:
   If the server wishes at any point to instruct the client to move to a
   different WebSocket endpoint (e.g. for load balancing purposes), the
   server MAY send a <close/> element and set the "see-other-uri"
   attribute to the URI of the new connection endpoint

With respect to the "MAY": what are the other ways of accomplishing this?

I think there aren't any; I think the "MAY" applies to the fact that the server can instruct the client, rather than (as written) how it does it. But you already start the whole thing with "If the server wishes," so I suggest that you just change "MAY send" to "sends" (and change "set" to "sets").  (I also think the second "MAY" isn't necessary, but at least it isn't wrong.)
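
For readers unfamiliar with the mechanism under discussion, here is a rough sketch of how a client might read the redirect target from a received frame. It assumes the framing namespace used in the published RFC 7395 (urn:ietf:params:xml:ns:xmpp-framing); the function name and the example URI in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: framing namespace as defined in the published RFC 7395.
FRAMING_NS = "urn:ietf:params:xml:ns:xmpp-framing"


def redirect_target(frame: str) -> Optional[str]:
    """Return the see-other-uri of a framing <close/> element, if present."""
    try:
        element = ET.fromstring(frame)
    except ET.ParseError:
        return None  # not well-formed XML, so not a usable <close/>
    if element.tag != "{%s}close" % FRAMING_NS:
        return None  # some other element, e.g. <open/> or a stanza
    return element.get("see-other-uri")
```

Given a frame such as `<close xmlns="urn:ietf:params:xml:ns:xmpp-framing" see-other-uri="wss://other.example/xmpp"/>`, the function returns the URI; for a plain `<close/>` in the same namespace it returns None.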

-- Section 3.8 --
Nit: I think you don't need a comma after "sub-protocol" (but you do need commas both before and after "as such").

In the second paragraph, "the use may be used" needs rewording.  Just delete "The use of" to fix that.

-- Section 3.10 --
The passive voice here leaves a question open: Can either the client or the server initiate this?  Or does it have to be done by the client?  It would be good to put it in active voice, I think, as "In order to alleviate the problems of temporary disconnections, the client MAY use the XMPP Stream Management extension...."  And similarly for the second paragraph.

-- Section 6 --
Nit: The last paragraph is missing a closing parenthesis after "[RFC6455]".

(Ted Lemon) (was Discuss) No Objection

Comment (2014-09-11)
All of my DISCUSSes and comments have been addressed.   I include the discuss and comments below for future reference, but no further work need be done to address them.

Former DISCUSS:
In the security considerations section, it would help to discuss how the security model possible using websockets compares to the security model available for regular XMPP.   I find the lack of any discussion of this frustrating, but don't know enough about websockets to be able to describe the incongruity that seems to exist here.   The action item for this DISCUSS would be to add some text discussing this.   I realize that that's vague, and so this is subject to negotiation on the call or by email; it's not my goal to hold up the document on this, just to see if it's possible to get more clarity.

The thing that leads me to worry about this is the inability of the client to actually know who it is talking to; the current text that talks about web host metadata (second paragraph) is useful, but leaves me wanting a bit more detailed discussion.

Aside from this and the comments below, the document is very clear and easy to follow.   Thanks for doing it!

Former comments:

I agree with Pete's comment.

In section 3.4, the example response does not include a "to" attribute as required by RFC 6120 section 4.7.2.   Am I missing something?

In section 3.5, are we sure that there are no connection initiation requests that could result in an error that would make it impossible to send a second frame?   Also, what does the client do if it receives a badly-formed open response, or if it receives something other than an open in response to an open?

In section 3.7, no reason is given for a stream restart being mandated.   Can you add a reference here (I assume this is described in detail in RFC 6120)?

In 3.8, suggest the following rewording:
OLD:
   The use of either of these extensions (or both) MAY be used to
   determine the state of the connection.
NEW:
   Either of these extensions (or both) MAY be used to
   determine the state of the connection.

Similarly in section 4:
OLD:
   Use of web-host metadata MAY be used to establish trust between the
   XMPP server domain and the WebSocket endpoint, particularly in multi-
   tenant situations where the same WebSocket endpoint is serving
   multiple XMPP domains.
NEW:
   Web-host metadata MAY be used to establish trust between the
   XMPP server domain and the WebSocket endpoint, particularly in multi-
   tenant situations where the same WebSocket endpoint is serving
   multiple XMPP domains.

(Kathleen Moriarty) No Objection

Comment (2014-07-09 for -07)
Thanks for addressing the SecDir comments: http://www.ietf.org/mail-archive/web/secdir/current/msg04891.html

Found a nit:
Section 3.9:  There is a word or two missing in the following sentence:
   Instead,
   when TLS is used, it MUST be enabled the WebSocket layer using secure
   WebSocket connections via the |wss| URI scheme.  (See Section 10.6 of
   [RFC6455].)
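
As an illustration of the requirement being quoted, a client could refuse endpoints that do not use the |wss| scheme before connecting. This is only a sketch: the function name and the example host are hypothetical, and it checks the URI scheme alone, not certificate validation.

```python
from urllib.parse import urlsplit


def uses_secure_websocket(endpoint: str) -> bool:
    """Return True if the endpoint URI uses the wss scheme."""
    # urlsplit normalizes the scheme to lower case.
    return urlsplit(endpoint).scheme == "wss"
```

For example, `uses_secure_websocket("wss://example.com/xmpp-websocket")` is true, while the same URI with the `ws` scheme is rejected.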

(Pete Resnick) No Objection

Comment (2014-07-08 for -07)
Just a comment, not a showstopper by any means:

Some of the MUSTs in this document seem kind of goofy. When I go to use a MUST, I usually ask myself, "What else could an implementer possibly do?", and if the answer is "If they don't do it, they're not implementing this protocol", then there's no need for the MUST. For example:

3.1:

   During the WebSocket handshake, the client MUST include the
   value |xmpp| in the list of protocols for the |Sec-WebSocket-
   Protocol| header.  The reply from the server MUST also contain |xmpp|
   in its own |Sec-WebSocket-Protocol| header in order for an XMPP sub-
   protocol connection to be established.

What else would an implementer do? Instead, try:

   In order to establish an XMPP sub-protocol connection, during the
   WebSocket handshake, the client includes the value |xmpp| in the
   list of protocols for the |Sec-WebSocket-Protocol| header, and the
   server includes |xmpp| in its own |Sec-WebSocket-Protocol| header in
   the reply.

There are other examples of these sorts of uses in the document. On the flip side, it is useful to give requirements on the receiving side, like "An implementation MUST reject with an error any frame that does not begin with a '<'". An implementation might not think to do that, and it's important.

The world doesn't end if you don't fix these up; that's why this is only a comment. Implementers will probably figure out that if the spec says, "Foobar MUST be X", they should probably reject foobars that aren't X. But I do think it would help implementers if you used MUSTs where an implementer might get themselves in trouble, not to define some sort of "conformance criteria". I think it's worth having a run through the document and convince yourselves where these are and aren't helpful uses of the term.
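
The receiving-side requirement used as an example above ("reject with an error any frame that does not begin with a '<'") could look roughly like the following. The exception and function names are hypothetical, and the sketch deliberately checks only the first character; it does not validate that the frame is well-formed XML.

```python
class FrameRejected(ValueError):
    """Raised when an incoming WebSocket message cannot be an XMPP frame."""


def check_frame(message: str) -> str:
    """Return the message unchanged, or raise if it does not begin with '<'."""
    if not message.startswith("<"):
        raise FrameRejected("frame does not begin with '<'")
    return message
```

A receiver would call check_frame on each incoming text message before handing it to its XML parser, and map FrameRejected to whatever error handling the protocol prescribes.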

(Martin Stiemerling) No Objection