
Telechat Review of draft-ietf-bmwg-ngfw-performance-13
review-ietf-bmwg-ngfw-performance-13-tsvart-telechat-pauly-2022-05-24-00

Request Review of draft-ietf-bmwg-ngfw-performance-13
Requested revision 13 (document currently at 15)
Type Telechat Review
Team Transport Area Review Team (tsvart)
Deadline 2022-05-31
Requested 2022-05-09
Requested by Al Morton
Authors Balamuhunthan Balarajah, Carsten Rossenhoevel, Brian Monkman
I-D last updated 2022-05-24
Completed reviews Secdir Early review of -00 by Kathleen Moriarty (diff)
Tsvart Last Call review of -12 by Tommy Pauly (diff)
Tsvart Telechat review of -13 by Tommy Pauly (diff)
Iotdir Telechat review of -13 by Toerless Eckert (diff)
Genart Telechat review of -13 by Matt Joras (diff)
Comments
This is a POST telechat review request.
Lars Eggert's ballot is a DISCUSS, and in a follow-up exchange Lars requested further TSV review of the draft.
Tommy Pauly has reviewed previous versions of the draft.
https://datatracker.ietf.org/doc/draft-ietf-bmwg-ngfw-performance/

See some of the post-ballot TCP exchange below. *The key question seems to be _where_ to draw the line between completely modern/RFC-compliant TCP stacks, and the requirements for stacks in test equipment that will operate 1000s of TCP connections*

> Hi Carsten, 
...
> let me try and provide a bit more context on the issues you flagged:
> 
> On 2022-2-3, at 16:09, Carsten Rossenhoevel <cross@eantc.de> wrote:
> > - "There are a lot of requirements in here that are either no-ops [..],
> non-sensical [..]"
> 
> That was in reference to the TCP paragraph. Specifically:
> 
> * "SHOULD use a congestion control algorithm" is superfluous (= a no-op),
> because the TCP standard already requires that (at MUST level)
> 
> * the delay for delayed ACKs is typically specified as a time interval; the
> document suggests to compute it as a byte size ("10 times the MSS"), which
> does not immediately make sense
> 
> * "Internal timeout SHOULD be dynamically scalable per RFC 793." It is unclear
> what is meant by the "internal timeout" or how it would dynamically scale. The
> reference to RFC 793 does not provide clarity here.
> 
> > "This document needs TSV and ART people to help with straightening out a
> lot of issues"
> 
> I was writing this as a hint to the WG on where to go for help with addressing
> these issues.

> > - "The document is also giving unnecessarily detailed behavioral
> descriptions [..]"
> 
> One reason I flagged this is because these details rule out existing (e.g.,
> fast open) or future TCP extensions. Another reason is that they are
> inaccurate, because although they are detailed they still do not capture all
> possible permitted behaviors (e.g., connection-close packet exchanges.)
> Thanks,
> Lars
Assignment Reviewer Tommy Pauly
State Completed
Request Telechat review on draft-ietf-bmwg-ngfw-performance by Transport Area Review Team Assigned
Posted at https://mailarchive.ietf.org/arch/msg/tsv-art/7e9daGl-IcJTEiVodsrdJ-b8xhk
Reviewed revision 13 (document currently at 15)
Result Almost ready
Completed 2022-05-24
This document has been reviewed as part of the transport area review team's
ongoing effort to review key IETF documents. These comments were written
primarily for the transport area directors, but are copied to the document's
authors and WG to allow them to address any issues raised and also to the IETF
discussion list for information.

When done at the time of IETF Last Call, the authors should consider this
review as part of the last-call comments they receive. Please always CC
tsv-art@ietf.org if you reply to or forward this review.

I agree with the points Lars made about this document being too specific
and constrained with regard to TCP details, and not nearly as specific
about other protocol details. This was brought up in my initial TSVART
review, which I will quote here since it still applies:

"From a transport perspective, I’m concerned that some things are over-specified
(details of TCP implementations) and others are underspecified (how throughput
is measured, how loss and delay are tested)... I’d like to see transports 
(TCP/UDP/QUIC/other) be treated more consistently throughout the document,
particularly since non-TCP traffic will become increasingly relevant for the
devices these tests are targeting.

...

The client configuration section 4.3.1.1 details TCP stack configuration, but
does not address other transports. Discussing QUIC seems like it will be
relevant soon.

Overall, for this section, I am struck that there’s a lot of detail that seems
over-specified, with lots of normative language. For example, the TCP
connection MUST end with a three- or four-way handshake. What if there’s a RST?
I don’t understand what we’re requiring of these TCP implementations apart from
being a functional and compliant TCP implementation. How much of this is
actually required?"

Given the IESG reviews, I do agree this needs to be addressed before moving forward.
While we could spend a long time with transport area folks trying to fix the details
and flesh out equal levels of detail for QUIC and HTTP/1.1 / HTTP/2 / HTTP/3 configurations,
I don't think that is appropriate for this document.

My suggestion would be to strike the details about TCP entirely, particularly the
extraneous normative requirements. If your concern is how the test equipment will
behave with 1000s of connections, express that as a top-level requirement for any
transport: state that the transports need to be tuned with common options to
ensure fairness and consistent use of the available bandwidth, etc. Explaining the
reasons behind the requirement will make it clearer.

You also already say that "these are the defaults in most client operating systems".
Rather than duplicating what you currently believe are the defaults, just encourage
the use of defaults.
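
As an illustration of what relying on defaults could look like in practice, a test
client might report the TCP settings the operating system chose instead of setting
each one itself. The following is a minimal Python sketch, not something taken from
the draft: the function name `default_tcp_options` and the particular options
queried are assumptions made for illustration, and `TCP_CONGESTION` is only
exposed on some platforms (e.g., Linux), so it is treated as optional.

```python
import socket


def default_tcp_options():
    """Return selected options of a fresh TCP socket, exactly as the OS set them.

    Hypothetical helper for illustration: nothing is overridden, the values
    are only read back so that a test report can record them.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        opts = {
            # 0 means Nagle's algorithm is left enabled.
            "TCP_NODELAY": s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY),
            # Send and receive buffer sizes in bytes, as chosen by the OS.
            "SO_SNDBUF": s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
            "SO_RCVBUF": s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        }
        # The congestion control algorithm name is platform-specific (assumption:
        # available on Linux); skip it where the constant or the call is missing.
        if hasattr(socket, "TCP_CONGESTION"):
            try:
                raw = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
                opts["TCP_CONGESTION"] = raw.split(b"\x00", 1)[0].decode()
            except OSError:
                pass
    return opts


if __name__ == "__main__":
    for name, value in default_tcp_options().items():
        print(f"{name} = {value}")
```

Recording these values alongside the test results would document the stack's
behavior without the benchmarking document having to mandate each parameter.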