Network Transport Circuit Breakers
draft-ietf-tsvwg-circuit-breaker-15

Yes

(Alissa Cooper)
(Martin Stiemerling)
(Spencer Dawkins)

No Objection

(Alia Atlas)
(Alvaro Retana)
(Benoît Claise)
(Brian Haberman)
(Jari Arkko)
(Terry Manderson)

Note: This ballot was opened for revision 13 and is now closed.

Alissa Cooper Former IESG member
Yes
Yes (for -13) Unknown

                            
Martin Stiemerling Former IESG member
Yes
Yes (for -13) Unknown

                            
Spencer Dawkins Former IESG member
Yes
Yes (for -13) Unknown

                            
Alia Atlas Former IESG member
No Objection
No Objection (for -13) Unknown

                            
Alvaro Retana Former IESG member
No Objection
No Objection (for -13) Unknown

                            
Barry Leiba Former IESG member
No Objection
No Objection (2016-03-16 for -13) Unknown
I agree with Ben's DISCUSS point.
Ben Campbell Former IESG member
(was Discuss) No Objection
No Objection (2016-03-16 for -13) Unknown
= Substantive: =

- General:

-- [I moved the following here from my original DISCUSS position. I've cleared the DISCUSS, since "discussion" has started and looks promising.]  The rule 18 "out of band" section says loss of control messages SHOULD NOT trigger the CB.  But rule 1 said "The CB MUST trigger if there is a failure of the communication path used for the control messages." These seem to be contradictory normative requirements.

-- I'm surprised not to see much, if any, commentary about how endpoints are expected to learn of and react to triggered CBs. Flows that just stop working for no apparent reason will violate the principle of least surprise for users. Users are likely to do exactly the wrong things (e.g. restart flows), unless they are informed of why.

- 1, last paragraph: 

-4, req 10:

What is meant by default here? Does this suggest that an implementation must disable all traffic unless a user explicitly configures it to behave differently?

- req 12: "... MUST be much more severe"
How do you measure that sort of requirement? Isn’t this already covered by 10 and 11 and 13?

- 5.1, last paragraph, last sentence:

Does this imply that circuit breakers can/will be stacked? If so, an explicit mention of that fact early in the document would be helpful.

- 5.1.1:
Why not reference draft-ietf-avtcore-rtp-circuit-breakers ? (Informational should be fine.)

= Editorial: =

-1, third paragraph: "Avoiding persistent excessive prevention ..."

Should that be "Avoiding persistent excessive _congestion_ ..."?

-- 5th paragraph, first sentence:

Is there such a thing as “normal excessive congestion?”

-- 8th paragraph: "This is to
   ensure that a Circuit Breaker does not accidentally trigger following
   a single (or even successive) congestion events (congestion events
   trigger transport congestion control, and are to be regarded as
   normal on a network link operating near capacity). "

I’m confused by this sentence. What is the subject of “are to be regarded as normal”?

-- 10th paragraph, first sentence: Is this the same as saying that circuit breakers should not trigger under normal conditions?

- 2nd to last paragraph, 2nd sentence:
In contrast to what? Not knowing the cause of congestion? (Are congestion-controlled protocols expected to know the cause?)

-4, req 3: It would have been helpful to mention ECN (or forward reference it) prior to the first mention of "lost/marked packets".

-- req 9: Isn’t this the point behind the last 3 (or more) requirements? That is, it seems like this is the real requirement and those were more implementation details.

- 5.1, first paragraph, 2nd sentence:
The sentence structure makes it look like you are using (TCP, SCTP, DCCP) as examples of "applications that do not use a full-fledged transport", which is obviously not the intent.
Benoît Claise Former IESG member
No Objection
No Objection (for -13) Unknown

                            
Brian Haberman Former IESG member
(was Discuss) No Objection
No Objection (for -13) Unknown

                            
Deborah Brungard Former IESG member
No Objection
No Objection (2016-03-16 for -13) Unknown
Agree with Ben's Discuss point. Also, Requirement 18 says "SHOULD" and "ought to" for in-band, vs. Requirement 1's "MUST". Would prefer Requirement 1 be clarified by combining it with Requirement 18. Prefer 18's requirement that an out-of-band control channel failure does not trigger the CB and disrupt traffic. May want to consider adding the ability to configure whether a loss of the control channel should trigger the CB. Other technologies treat the loss of the control protocol as a freeze on the bridge selector state. Also may want to consider adding the ability to configure a freeze of the CB for maintenance/operations.
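
The two configuration knobs suggested above (whether loss of the control channel triggers the CB, and an operator freeze for maintenance) could look roughly like the following. This is a minimal sketch and not taken from the draft: the class, field names, and the 10% threshold are hypothetical, and a real circuit breaker would judge persistent congestion over several measurement intervals rather than react to a single report.

```python
from dataclasses import dataclass
from enum import Enum


class ControlLossPolicy(Enum):
    TRIGGER = "trigger"  # Requirement 1 reading: control path failure trips the CB
    HOLD = "hold"        # Requirement 18 reading: keep the current state


@dataclass
class CircuitBreaker:
    # All names and defaults here are hypothetical, for illustration only.
    loss_threshold: float = 0.1
    on_control_loss: ControlLossPolicy = ControlLossPolicy.HOLD
    frozen: bool = False     # operator freeze for maintenance/operations
    triggered: bool = False

    def report_loss(self, loss_rate: float) -> None:
        """Process a loss measurement received over the control channel."""
        if self.frozen:
            return
        if loss_rate > self.loss_threshold:
            self.triggered = True

    def control_channel_lost(self) -> None:
        """React to failure of the control path according to the policy."""
        if self.frozen:
            return
        if self.on_control_loss is ControlLossPolicy.TRIGGER:
            self.triggered = True
```

With the default `HOLD` policy, `control_channel_lost()` leaves traffic flowing; with `TRIGGER` it trips the breaker unless `frozen` is set.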
Jari Arkko Former IESG member
No Objection
No Objection (for -13) Unknown

                            
Joel Jaeggli Former IESG member
No Objection
No Objection (2016-03-13 for -13) Unknown
linda.dunbar@huawei.com performed the opsdir review

   Simple protection can be provided by using a randomized source port,
   or equivalent field in the packet header (such as the RTP SSRC value
   and the RTP sequence number) expected not to be known to an off-path
   attacker.  Stronger protection can be achieved using a secure
   authentication protocol.  This attack is relatively easy for an on-
   path attacker when the messages are neither encrypted nor
   authenticated.  When there is a risk of on-path attack, a
   cryptographic authentication mechanism for all control/measurement
   messages is RECOMMENDED to mitigate this concern.

By on-path attacker we mean service provider.

As with, for example, HLS (which is congestion controlled), which ISPs, particularly wireless carriers, then deliberately degrade: providing a protocol with a well-understood mechanism for shutting it down means that mechanism can be used maliciously, without imposing a strict firewall policy. That's convenient for plausible deniability.  I wonder if the technical response from implementors isn't to wrap the flow in something harder to inspect.

The discussion of a multicast circuit breaker seems largely anchored in SSM with respect to there being one (reliable) sender. In the ASM case it seems trivial for a malicious sender to cause all the other parties to leave the group by misreporting its own sending properties (and do so with very few packets).
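
For reference, the cryptographic authentication of control/measurement messages that the quoted text recommends might be sketched as below. This is an illustration under stated assumptions, not a mechanism defined by the draft: the message layout (sequence number, lost count, received count), the pre-shared key, and the function names are all hypothetical. It uses HMAC-SHA256 from the Python standard library and rejects replayed sequence numbers.

```python
import hashlib
import hmac
import struct

REPORT_FMT = "!QII"  # hypothetical layout: sequence number, lost, received
TAG_LEN = 32         # size of an HMAC-SHA256 tag in bytes


def sign_report(key: bytes, seq: int, lost: int, received: int) -> bytes:
    """Serialize a measurement report and append an HMAC-SHA256 tag."""
    payload = struct.pack(REPORT_FMT, seq, lost, received)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def verify_report(key: bytes, message: bytes, last_seq: int):
    """Return (seq, lost, received) if the report is authentic and fresh, else None."""
    if len(message) != struct.calcsize(REPORT_FMT) + TAG_LEN:
        return None
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        return None
    seq, lost, received = struct.unpack(REPORT_FMT, payload)
    if seq <= last_seq:  # reject replayed or stale reports
        return None
    return seq, lost, received
```

A shared-key scheme like this addresses forged reports from parties without the key; it does nothing against a party that holds the key and misreports, which is the ASM concern raised above.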
Stephen Farrell Former IESG member
No Objection
No Objection (2016-03-16 for -13) Unknown
- I agree with Ben's discuss.

- Do the MPLS folks and similar agree with 6.1? If so, great.
(And how did you figure that out?) If not, doesn't that make a
big part of this BCP mythical?  (Which would seem
undesirable.)

- 6.2, 2nd para: what question?

- Section 7 (and earlier): You RECOMMEND a crypto mechanism to
mitigate possible DoS. In both cases however, the statement is
ambiguous. Are you RECOMMENDing a mechanism be defined or that
one be used? (And of course if you asked me, I'd say that it'd
be better to say a crypto mechanism MUST be used, even when
off-path attacks seem unlikely to work.)
Terry Manderson Former IESG member
No Objection
No Objection (for -13) Unknown