Real-Time Streaming Protocol Version 2.0
draft-ietf-mmusic-rfc2326bis-40

Yes

(Gonzalo Camarillo)

No Objection

(Jari Arkko)
(Richard Barnes)

Abstain


Recuse

(Martin Stiemerling)

No Record


Note: This ballot was opened for revision 38 and is now closed.

Gonzalo Camarillo Former IESG member
Yes
Yes (for -38) Unknown

                            
Adrian Farrel Former IESG member
No Objection
No Objection (2013-12-02 for -38) Unknown
Looks to me like Barry may have reviewed some of the text. I looked and found nothing threatening to routing so I will ballot No Objection.  I did note the following...

In Section 18.21 you have some ambiguity in the instruction to the RFC
Editor. It reads...

      Note: The RTSP 2.0 date format is defined to be the RFC 5322 full
      date format.  This format is more flexible than the RFC 1123 date
      format used by RTSP 1.0.  Thus implementations should use single
      spaces as recommended by RFC 5322 as separators and support
      receiving the obsolete format.

      [[RFC Editor please remove this note: Prior to version 37 of the
      draft, rfc2326bis envisaged sticking with the RFC 1123 format.]]

I think you only want the Editor to remove the note in double square
brackets. But placed where it is, the editor will think they have to 
remove the whole Note, which I doubt is what you want.
Barry Leiba Former IESG member
(was Discuss) No Objection
No Objection (2014-01-22 for -39) Unknown
UPDATE:
Version -39 addresses my DISCUSS points, and most of my COMMENTs -- THANKS!

I still don't see an RFC Editor note, so I'm leaving that comment.  Also, below I'm leaving my comments that I see have not been addressed.  Maybe that was intentional, but maybe, with all the comments, that was an oversight.  They're all non-blocking, but I do think they're changes that should be made.  I'll say no more about them, though.  (I've also added comments for sections 15 and 16, which I hadn't reviewed yet when I posted my other comments.)

I have not reviewed beyond Section 16, and I had hoped to do so.  That said, it's not fair for me to hold anything up for that, especially as I have a lot of other things going on.  That does sort of get into Pete's comment, that the document is too long to get thorough and complete review.  Waddyagonnado?

========================================================

First, a comment for the responsible AD:
The grammar, punctuation, and English usage in Section 2 and its subsections are
at times very hard to sort through.  I'm going to call out in the comments some
bits that I found particularly troublesome, and will try to suggest
alternatives.  I also suggest that the responsible AD put in an RFC Editor note
asking the editors to pay particular attention here and to give it some heavy
editing for language clarity.  This is a complex document, and a good overview
in clear English will really help.  Here's a suggestion for text for an RFC
Editor note:
------------------------------
RFC Editor note:
Section 2 of this document and its subsections provide "an informative overview
of the different mechanisms in the RTSP 2.0 protocol, to give the reader a high
level understanding".  As such, it's very important that it be clear and well
written.  Some reviewers have found the English to be difficult.  Please take
an especially critical pen to this section, making sure that the grammar and
usage are correct, and that the language is clear.
------------------------------

Comments that have not been addressed in -39:

-- Section 4.7.2 --

   The following different media retention policies
   are envisioned and taken into consideration where applicable:

What is that meant to be saying, really?  Please rephrase it (I have no idea
what "and taken into consideration where applicable" might mean).

   Time-Duration:  Each individual unit of the media will be retained
      for the specified duration.

What does "each individual unit of the media" mean?

-- Section 4.7.3 --

   Dynamic:  Between explicit updates the media content will not change,
      but the content may change due to external methods or triggers,
      such as playlists.

This sounds like doubletalk: it won't change, but it may change.  I think it
needs re-wording.

-- Section 4.7.5 --
It appears that you're mixing example headings (which contain colons) on the
same line as header fields (which contain colons), making the whole thing
unclear.  I suggest putting the example title on its own line, then starting
the example on a new line, like this:

   Example of "On-demand":
      Random Access: Random-Access=5.0, Content Modifications:
      Immutable, Retention: Unlimited or Time-Limited.

-- Section 5.1 --

   In the interest of robustness, agents MUST ignore any empty line(s)
   received where a Request-Line or Response-Line is expected.  In other
   words, if the agent is reading the protocol stream at the beginning
   of a message and receives a CRLF first, it MUST ignore the CRLF.

What's a "Response-Line"?  I also presume the last sentence above is applicable
recursively (you want to ignore any number of leading CRLF sequences), yes?
 Might be good to be clear about that.
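
To illustrate the recursive reading asked about above, here is a minimal Python sketch. It assumes the agent holds the unread bytes of the protocol stream in a buffer; the function name and the example URI are hypothetical and do not come from the draft.

```python
def strip_leading_empty_lines(buffer: bytes) -> bytes:
    """Drop every empty line (bare CRLF) that precedes the start of a message."""
    # Loop rather than test once, so any number of leading CRLFs is ignored.
    while buffer.startswith(b"\r\n"):
        buffer = buffer[2:]
    return buffer


# Two stray CRLFs arrive before the request line.
raw = b"\r\n\r\nPLAY rtsp://example.com/media RTSP/2.0\r\nCSeq: 1\r\n\r\n"
message = strip_leading_empty_lines(raw)
first_line = message.split(b"\r\n", 1)[0]
```

Under this reading, `first_line` holds the PLAY request line regardless of how many empty lines came first.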

-- Section 9.3 --

HTTP and email have both historically had horrendous problems with inaccurate
or missing Content-Type headers.  It's good that you use MUST for Content-Type;
that has to help.  I would be happier (though not at DISCUSS level) if you
would say a few more words about Content-Type here: something about how it
MUST accurately specify the type and subtype of the content, that there are no
defaults for type or subtype, and that attempts to use heuristics to
second-guess this are Very Bad.

-- Section 10.2 --

   The scheme of the RTSP URI (Section 4.2)
   indicates the default port that the server will listen on if the port
   is not explicitly given.

The URI is created by the client, not the server, so it doesn't specify what
port the server "will listen on."  It specifies the port the client will
contact the server on.  I would say it this way:

NEW
   The scheme of the RTSP URI (Section 4.2)
   allows the client to specify the port it will contact the server
   on, and defines the default port to use if one is not explicitly
   given.
END

   This port may
   provide some benefits from non-registered ports if a RTSP server is
   unable to use the default ports.

I think you mean "some benefits over non-registered ports".  "Over", not "from".

      authorize a client establishing a new connection as being a
      legitimate receiver of a request related to a particular RTSP
      session without the client first issuing requests related to the
      request.

There's a lot of "request" in there, and it's confusing.  Maybe this?:

NEW
      authorize a client establishing a new connection as being a
      legitimate receiver of a request related to an existing RTSP
      session, without the client first issuing new requests related
      to the pending request.
END

   To avoid this problem an RTSP agent
   is recommended to buffer incoming messages locally so that any
   response messages can be processed immediately upon reception.

This doesn't make sense to me.  If you're buffering them, you're not processing
each one immediately, right?  You won't process a message until it has its turn
to be popped out of the buffer.  What are you really trying to say here?

-- Section 10.7 --

   However, this can cause the situation where multiple RTSP clients
   again send requests to a proxy or server at roughly the same time
   which may again cause an overload situation, or if the "old" overload
   situation is not yet solved, i.e., the length indicated in the Retry-
   After header was too short.

I can't parse this (in particular, I can't make sense of what comes after ", or
if"); please try re-wording, probably splitting the too-long sentence in the
process.

   A more complex case may arise when a load balancing RTSP proxy is in
   use, i.e., where an RTSP proxy is used to select amongst a set of
   RTSP servers to handle the requests, or when multiple server
   addresses are available for a given server name.

Here's an example of a problem with using "i.e.": I can't tell what the scope
of the "i.e." is.  Is it the rest of the sentence ["X (i.e., A, or B)"], or is
it just the first part ["X (i.e., A), or B"] ?  Please clarify the text.

   It is REQUIRED to not set the
   same value in the timer for each scheduling, but instead to add a
   variation to the mean value, resulting in picking a random value
   within the range of 0.5 to 1.5 of the mean value.

Is this meant to say "0.5 to 1.5 *times* the mean value"?  "Of" doesn't do that.
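
If the "times" reading is the intended one, the rule could be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and a uniform distribution is an assumption, since the quoted text only asks for a random value within the range.

```python
import random


def jittered_interval(mean_seconds: float, rng=random) -> float:
    """Return a random timer value between 0.5 and 1.5 times the mean."""
    # uniform() is an assumed distribution; the quoted text only gives the range.
    return mean_seconds * rng.uniform(0.5, 1.5)


# With a 10-second mean, each scheduling lands somewhere in [5.0, 15.0].
next_retry = jittered_interval(10.0)
```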

-- Section 13.5 --

   PLAY_NOTIFY requests MAY be used with a message body, depending on
   the value of the Notify-Reason header.  It is described in the
   particular section for each Notify-Reason if a message body is used.
   However, currently there is no Notify-Reason that allows using a
   message body.

That last sentence should be a clue that the "MAY" here is wrong.  You're not
describing an option for the implementor.  Depending upon the value of
Notify-Reason, either a body is needed or it isn't.  You can make this "may",
or, better, "will sometimes".

-- Section 15.1 --

   However, it is important to consider if this header
   and its function is required to be understood by the proxy or can be
   forwarded.  If the header needs to be understood, a feature-tag
   representing the functionality MUST be included in the Proxy-Require
   header.  Below are guidelines for analysis if the header needs to be
   understood.

In the first sentence, I think you mean, "or can simply be forwarded."  That
sense is lost without the word "simply".  In the last sentence, the guidelines
aren't for application only if the header needs to be understood; they're
guidelines to determine that.  So, "Below are guidelines for analyzing
whether the header needs to be understood."

   The transport header and its parameters also shows that
   headers that are extensible and require correct interpretation in the
   proxy also require handling rules.

I don't understand this sentence at all.  Can you try re-wording it?

   Transport modifying:  The access and the security proxy both need to
      understand how the transport is performed, either for opening
      pinholes or to translate the outer headers, e.g., IP and UDP.

RTSP 2.0 isn't using UDP, right?  So what's that about?

-- Section 16.1.1 --

   In simple terms, a cache entry is considered to be
   valid if the content has not been modified since the Last-Modified
   value.

Strikes me as too simple, and wrong.  It's not possible for either the streamed
or the cached data to have been modified since the Last-Modified value.
What I think you mean to say is, "if the cache entry was created after the
Last-Modified time."
Brian Haberman Former IESG member
No Objection
No Objection (2014-02-03 for -39) Unknown
I managed to slog through this beast and am balloting No-Obj on the basis that I do not see any functionality that impacts Internet Area protocols.  I fully support the comments by other ADs that:

1. it seems improbable that two independent implementations would interoperate

2. the size and complexity of the spec makes it nigh unreadable
Jari Arkko Former IESG member
No Objection
No Objection (for -38) Unknown

                            
Joel Jaeggli Former IESG member
No Objection
No Objection (2013-12-02 for -38) Unknown
I have reviewed this to the extent that I think reasonable, and I have few misgivings about the text that aren't editorial.

I have a serious concern about the size and structure of the document.

By my count it took something like 11 years to write this. I think it is rather problematic to attempt to be comprehensive within the scope of a single document. Parts of this, the IANA registries, the security model section, the header definitions, examples, use of SDP, and so on should be separate documents.

We should not be repeating this experience, either from an editing, implementation or registry management standpoint.
Richard Barnes Former IESG member
No Objection
No Objection (for -38) Unknown

                            
Spencer Dawkins Former IESG member
No Objection
No Objection (2013-12-04 for -38) Unknown
Just a couple of questions about section 10 ... 

In this text: 10.  Connections

   RFC 2326 attempted to specify an optional mechanism for transmitting
   RTSP messages in connectionless mode over a transport protocol such
   as UDP.  However, it was not specified in sufficient detail to allow
   for interoperable implementations.  In an attempt to reduce
   complexity and scope, and due to lack of interest, RTSP 2.0 does not
   attempt to define a mechanism for supporting RTSP over UDP or other
   connectionless transport protocols.  A side-effect of this is that
   RTSP requests MUST NOT be sent to multicast groups since no
   connection can be established with a specific receiver in multicast
   environments.

Is this the right way to say this? (If you try to open a multicast connection to a TCP address, do you get far enough to "send an RTSP request"?) 

In this text: 10.3.  Closing Connections

   A requester SHOULD wait at least 10 seconds for a response before
   concluding that the responder will not be responding to its request.
   After receiving a 100 response, the requester SHOULD continue waiting
   for further responses.  If more than 10 seconds elapses without
   receiving any response, the requester MAY assume that the responder
   is unresponsive and abort the connection by closing the TCP
   connection.

Would it be helpful to say anything about not immediately opening a new connection to the same unresponsive responder?
Stephen Farrell Former IESG member
(was Discuss) No Objection
No Objection (2014-01-28 for -39) Unknown

-- These used to be discuss points

(4) 19 - Why is there no equivalent of HTTP CONNECT for
TLS?  It seems like the choices are to either connect
directly over TLS to the origin server or else to have to
use a proxy that sees all the plaintext and headers.

(5) 19.2 - 2nd last para: Why don't you use SNI here? Just
wondering, but it'd fix a problem if it worked.

The authors' response was that these hadn't been considered
and it's too late in the day now, but that they could be
investigated further later if needed.

I think that's a shame and means that we're consciously
not doing as well at security as we might. However, I also
accept that this draft has been on the go since 2002 already
so it's not likely to undergo any radical change now.

If there's a way to add a mention of these points as
possible future work that'd be nice.


-- old comments

- If a [HX.Y] reference to 2616 has been updated or
changed by httpbis (which is on the telechat agenda after
this one) then is it correct for RTSP to still refer to
2616? I could imagine that the answer might vary in each
case.  Has someone checked if any such discrepancies exist
between this, 2616 and httpbis? (I could understand that
that'd have been an unreasonable question right up until
now when httpbis has completed IETF LC.)

- 2.1 - is this versioning scheme really still valid?  It
doesn't sound like it. If it's not, then it might be good
to say that so folks don't write code that assumes things
will work like this in future.

- 4.3 - I don't believe you can confirm that something
"MUST be chosen cryptographically random" since it could
always be "Ri=E(k,Ri-1)" which will look but not be
random.

- 5.2 - It's probably defined somewhere but is there a way
to make a very long header field, e.g. like a DKIM
signature?

- 8.2 - I don't understand how it's safe to treat an
unrecognised response header as a message body header.
That seems hacky and likely to encourage hacks. And I
don't recall that being said for request header fields.

- 10.2 - what are classifiers in n/w monitoring tools?

- 10.4 - typo: "taking long time" 

- 10.4 - the "agent SHOULD establish a new connection" as
a mitigation makes no sense to me - how would a server
(say) know where to connect to a proxy? (Which might be
behind a CGN who knows.)

- 10.5 - doesn't this assume that the RTSP and RTCP/RTP
server sides can communicate, but how can a client know
that?

- 10.5 - is "or when using reliable protocols" still
needed?

- 11.1 - 1st para: I've no idea what the last sentence
there is meant to mean.

- 13.3.1 - if I can change my transport parameters isn't
that a potential DoS vector?

- 13.9 - is there any chance someone will extend
SET_PARAMETER in a dangerous way? e.g. to allow:
C->S: SET_PARAMETER rtsp://eg.com/../etc/passwd RTSP/2.0 
      ...  
      root:x:0:0:root:/root:/bin/bash

- 13.9 - pity you'd used 451 already - Tim Bray's HTTP
451 is cute!

- 13.10 - is REDIRECT all ok with authentication? e.g.
there's no statement that the same dictionary attackable
equivalent for HTTP Basic or Digest MUST be sent if the
server redirects, etc.

- 13.10 -  I don't get what it'd mean for a client to send
a REDIRECT or does "The proxy is responsible for accepting
REDIRECT responses from its clients" mean something else?

- 14 - interleaved binary data, hmmm - buffer overrun
thoughts, lots of lovely complexity...

- 18.4.31-34 - I don't get the 466/470/471 errors, can you
explain?

- 18.7 - again, maybe reference httpbis?

- 18.10 - You don't say whether (D)TLS application PDU
padding is included or not, nor if this can be sent over
TCP.

- 20 - I skipped the ABNF sorry:-)

- 21.2.1: I don't see how "Thus, an RTSP server MUST only
allow client-specified destinations for RTSP-initiated
traffic flows if the server has ensured that the specified
destination address accepts receiving media through
different security mechanisms" can be implemented really.

- 21.2.1 - Doesn't that make [I-D.ietf-mmusic-rtsp-nat]
normative as you're saying it's a solution for remote DoS?
Stewart Bryant Former IESG member
No Objection
No Objection (2013-11-20 for -38) Unknown
I have reviewed this text solely looking for impact on the routing system and see no such impact.
Pete Resnick Former IESG member
Abstain
Abstain (2013-12-04 for -38) Unknown
I have reviewed as much as I reasonably could.

This first comment is more for the IESG than the authors; I don't think the authors can really do anything about this now:

I have a really hard time believing that a document of this size could possibly be independently implementable. There is simply no way that this document got the kind of review by enough eyeballs to be able to claim it is "generally stable, has resolved known design choices, is believed to be well-understood, has received significant community review, and appears to enjoy enough community interest to be considered valuable." And the fact that there is only one implementation of this does not give me confidence. Robert's GenArt review makes it clear that, due to the length and structure of the document, he has concerns about interoperable implementation. Elwyn did  a yeoman's job in his GenArt review, and though he is satisfied that his concerns were addressed, there are still many significant issues: Barry also did a detailed review and has found several problems, and I've got more below. We know that secdir did not do a comprehensive review of this document. I have done a review of this spec at high speed, and therefore at a completely cursory level, and I know that other ADs are in the same boat. I feel that spending the time that would really be needed for a full review would be unfair to other important documents. I don't know how this thing was left to continue for 11 years, to go from an already hefty 90 page document at -00 to this 320 page monstrosity, without it being stopped earlier and at least broken into pieces.  Even the document writeup indicates that later versions have not seen WG participation, that the authors have really been the only ones tracking it recently, and that the only thing that can be claimed is "no indication of lack of consensus"; that's not exactly a ringing endorsement. I don't know how, in good conscience, the IESG can move this document forward as representing the consensus of the WG, let alone the consensus of the IETF as a whole. It seems to me that the process failed long ago for this document. 
And I don't know what, if anything, we can do about it at this point.

Substantive comments:

Section 1:

Strike, ", similar to the remote control for a DVD player" and ", as known from DVD players remote control or media players". They're anachronistic.

In the discussion of changes from 1.0, I would mention that greenfield implementations should use 2.0 since 1.0 is incomplete.

Section 2.6: "however, it is recommended to release the session context". The client or the server? I hope "it is recommended that the client release the session context". I hate when servers tear down sessions even when I am sending a keep-alive. The server may certainly tear it down if it needs the resources, but the document shouldn't recommend that it be torn down.

Section 10.2:

   The RTSP agent SHALL NOT use more than one
   connection per RTSP session at any given point.

The explanation given later does not justify the SHALL NOT. It certainly is useful for the server to only use a single connection, but I see no justification for the REQUIREment.

   A server that attempts to send a
   request to a client that has no connection currently to the server
   SHALL discard the request directly.

I don't know what the word "directly" means here. Can you simply strike it, or is it meaningful?

Sections 14-16: These seem more like appendices or a separate document rather than part of the main protocol.

Section 16: This section uses the "we" convention from academic papers. This is distracting and a strange affectation. Please change to "This document", or in some cases simply rewrite the sentence.

Section 17.1: Title should be "Continue", not "Success".

Section 17.3: '3rr' is used to distinguish it from HTTP usage of '3xx'? If so, you should say that. Don't all 3rr responses require a Location? If so, put that instruction here, not in 17.3.3. Also, the generic instruction to use 302 for unknown 3rr is probably better here rather than in 17.3.1.

Section 18.20: "The initial sequence number MAY be any number, however, it is RECOMMENDED to start at 0." That MAY is wrong. Change to "can".

Section 18.21: You say that you are using "a full date as specified by Section 3.3 of [RFC5322]". I presume that you are *not* including the obsolete syntax. You should probably say that explicitly. However, the 5322 3.3 syntax without the obsolete forms does not allow for the alphabetic time zones like "GMT". It only permits the numeric time zones. If you really want to do that, you should change the examples throughout the document to "+0000". Otherwise, you could explicitly allow obs-zone, but not the rest. Also, be aware that 5322 3.3 syntax is allowed to end with CFWS. Are you OK with that?
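
To make the two zone forms concrete, here is a small Python sketch using the standard library's `email.utils`. The timestamp is an arbitrary assumption chosen for illustration, and the sketch says nothing about which form the draft should end up requiring.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Arbitrary example instant in UTC (an assumption, not taken from the draft).
moment = datetime(2013, 12, 4, 12, 0, 0, tzinfo=timezone.utc)

# Numeric zone, as permitted by the non-obsolete RFC 5322 Section 3.3 syntax.
numeric_zone = format_datetime(moment)

# Alphabetic "GMT" zone, which RFC 5322 only allows through obs-zone.
obsolete_zone = format_datetime(moment, usegmt=True)

# A tolerant receiver can accept both forms and recover the same instant.
parsed_numeric = parsedate_to_datetime(numeric_zone)
parsed_obsolete = parsedate_to_datetime(obsolete_zone)
```

Here `numeric_zone` ends in "+0000" and `obsolete_zone` ends in "GMT", while both parse back to the same point in time.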

(Note again: I did not scrub the ABNF. These are only the things I found from a quick review.)

Section 20.1: Is there a reason for not just importing the 5234 core rules?

   UTF8-NONASCII    = UTF8-1 / UTF8-2 / UTF8-3 / UTF8-4
   UTF8-1           = <As defined in RFC 3629>

That's wrong. You want to strike UTF8-1 and just say: "UTF8-NONASCII = UTF8-2 / UTF8-3 / UTF8-4"

UTF8-CONT is unnecessary. You should use UTF8-tail, which is defined in 3629. The only two places it appears are in header-value (more on that below) and in Appendix F, which should be using the UTF8-NONASCII syntax from here, not re-creating it.
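
As a rough illustration of what "UTF8-NONASCII = UTF8-2 / UTF8-3 / UTF8-4" admits, the following Python sketch leans on the interpreter's strict UTF-8 decoder instead of hand-written byte ranges. The function name is hypothetical, and the check is a sketch under that assumption, not a replacement for the ABNF.

```python
def is_utf8_nonascii(octets: bytes) -> bool:
    """True if octets form exactly one UTF8-2, UTF8-3, or UTF8-4 sequence."""
    try:
        # Strict decoding rejects overlong forms, surrogates, and the
        # 5- and 6-octet lead bytes (%xF8-FD) that RFC 3629 does not define.
        text = octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # One character encoded in two or more octets is non-ASCII by definition.
    return len(text) == 1 and len(octets) >= 2


two_octet_ok = is_utf8_nonascii(b"\xc3\xa9")              # U+00E9
five_octet_ok = is_utf8_nonascii(b"\xf8\x88\x80\x80\x80")  # 5-octet form
```

The 5-octet input is rejected here, whereas ranges such as %xF8-FB followed by four continuation octets would accept it.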

Section 20.2.1: Do you really want header-value to have UTF8-CONT freely distributed? That seems wrong. I think UTF8-CONT probably should be struck. (If you want to allow any octet, not part of a UTF-8 sequence, and not ASCII controls, use TEXT instead.)

Section 22: I find 2119 in IANA Considerations sections problematic. There shouldn't be 2119 requirements on IANA, and they are silly uses of the words when applied to registrants or expert reviewers; in all cases, they simply allow this document to avoid saying who is responsible for enforcing the rule. Please get rid of them throughout this section and its subsections, and make it clear whether "IANA needs to collect the following information" or "Section XYZ of this document requires that the entry is of such and so format" or "Registrants are asked to do blah blah blah" or "Expert Reviewers should confirm that the following information is in the registration", etc.

Specific changes in the main Section 22:

OLD
   it MUST follow the procedure
NEW
   registrants need to follow the procedure

OLD
   A registration request to IANA MUST contain the following
   information:
NEW
   IANA needs to obtain the following information for any
   registration request:
(This one is especially silly because some of the items are not actually required.)

Section 22.1.2:

- Capitalize "first come, first served", and perhaps cite 5226.
- You cannot have a SHOULD on the length of the name on a FCFS registry. IANA has no way to decide exceptions to such a rule. Either make a maximum length, or don't.
- The reference to "first part" of the feature-tag doesn't make sense as syntactically feature-tags don't have parts.

I suggest something like this as a rewrite for the second paragraph:

   The registry entry for a feature-tag has the following information:
   
      - The name of the feature-tag
         - If the registrant indicates that the feature is proprietary,
           IANA should request a vendor "prefix" portion of the name.
           The name will then be the vendor prefix followed by a "."
           followed by the rest of the provided feature name.
         - If the feature is not proprietary, then IANA need not
           collect a prefix for the name.
      - A one paragraph description of what the feature-tag represents
      - The applicability (server, client, proxy, or some combination)
      - A reference to a specification, if applicable
   
   Feature-tag names (including the vendor prefix) may contain any
   non-space and non-control characters. There is no length limit
   on feature-tags, though registrants may want to limit their
   length to twenty characters because...? [Etc.]

Section 22.2.2: Strike the MUSTs and the SHOULD. The first sentence should simply be "The registration policy for new RTSP methods is Standards Action [RFC5226]."

Please have a go at fixing these kinds of issues throughout Section 22.

Appendix F:

OLD
   TEXT-UTF8char    =  %x21-7E / UTF8-NONASCII
   UTF8-NONASCII    =  %xC0-DF 1UTF8-CONT
                    /  %xE0-EF 2UTF8-CONT
                    /  %xF0-F7 3UTF8-CONT
                    /  %xF8-FB 4UTF8-CONT
                    /  %xFC-FD 5UTF8-CONT
   UTF8-CONT        =  %x80-BF
NEW
   TEXT-UTF8char    =  <See section 20.1>
Martin Stiemerling Former IESG member
Recuse
Recuse (for -38) Unknown

                            
Benoît Claise Former IESG member
No Record
No Record (2013-12-05 for -38) Unknown
So I went for "no record".  There are three reasons for that.
1. I want to stress that having a 322-page document is not practical. (Did I say ridiculous, no? ... Maybe I'm thinking too loud...)
At some point in time, the IESG might have to think about limiting the number of pages an AD has to read as preparation for the IESG telechat, and not the number of documents.
For the same number of pages (read: time spent), I personally preferred to read all the other documents and improve their quality, as opposed to reviewing a single document. And yes, I spend 100% of my time on this AD job.

2. I did a very high level OPS check, basically doing OPS keyword searching.
When I compare the time spent versus the number of pages, this is so minimal that it can't be qualified as a review. I understand that the IESG shouldn't be the bottleneck, but simply selecting "no objection" while very little review has been done is not the right way IMHO.

3. Finally, I could not find an OPS-DIR reviewer. Guess why?

So I prefer to clearly express that I have not done a review for this document.

This feedback is not actionable by the authors.