Ballot for draft-ietf-bmwg-ngfw-performance
Note: This ballot was opened for revision 13 and is now closed.
Warren Kumari Yes
Éric Vyncke (was Discuss) Yes
Thank you for the work put into this document and for addressing my previous DISCUSS and my previous COMMENT. They are kept below only for archiving purpose.

Thanks to Toerless for his deep and detailed IoT directorate review, I have seen as well that the authors are engaged in email discussions on this review: https://datatracker.ietf.org/doc/review-ietf-bmwg-ngfw-performance-13-iotdir-telechat-eckert-2022-01-30/

Special thanks to Al Morton for the shepherd's write-up including the section about the WG consensus.

I hope that this helps to improve the document,

Regards,

-éric

# previous DISCUSS for archiving

As noted in https://www.ietf.org/blog/handling-iesg-ballot-positions/, a DISCUSS ballot is a request to have a discussion on the following topics

The document obsoletes RFC 3511, but it does not include any performance testing of IP fragmentation (which RFC 3511 did), which is AFAIK still a performance/evasion problem. What was the reason for this lack of IP fragmentation support ? At the bare minimum, there should be some text explaining why IP fragmentation can be ignored.

# previous COMMENT for archiving

One generic comment about the lack of testing with IPv6 extension headers as they usually reduce the performance (even for NGFW/NGIPS). There should be some words about this lack of testing.

## Section 4.1

Please always use "ARP/ND" rather than "ARP".

## Section 4.2

Any reason why "SSL" is used rather than "TLS" ?

Suggest to replace "IP subnet" by "IP prefix".

## Section 126.96.36.199 (and other sections)

"non routable Private IPv4 address ranges" unsure what it is ? RFC 1918 addresses are routable albeit private, or is it about link-local IPv4 address ? 169.254.0.0/16 or 198.18.0.0/15 ?

## Section 188.8.131.52

Suggest to add a date information (e.g., 2022) in the sentence "The above ciphers and keys were those commonly used enterprise grade encryption cipher suites for TLS 1.2".

In "[RFC8446] defines the following cipher suites for use with TLS 1.3." is this about a SHOULD or a MUST ?

## Section 6.1

In "Results SHOULD resemble a pyramid in how it is reported" I have no clue how a report could resemble a pyramid. Explanations/descriptions are welcome in the text.

## Section 7.8.4 (and other sections)

In "This test procedure MAY be repeated multiple times with different IP types (IPv4 only, IPv6 only and IPv4 and IPv6 mixed traffic distribution)" should it be a "SHOULD" rather than a "MAY" ?
Alvaro Retana No Objection
The datatracker should indicate that this document replaces draft-balarajah-bmwg-ngfw-performance.
Erik Kline No Objection
[throughout; comment] * In all sections describing Configuration Parameters, both Client and Server "IP address range" is mentioned in the singular. I think appropriate s/range/ranges/ might make sense.
Lars Eggert (was Discuss) No Objection
Section 1. , paragraph 2, comment:
> 18 years have passed since IETF recommended test methodology and
> terminology for firewalls initially ([RFC3511]). The requirements
> for network security element performance and effectiveness have
> increased tremendously since then. In the eighteen years since

These sentences don't age well - rephrase without talking about particular years?

Section 184.108.40.206. , paragraph 2, comment:
> The server pool for HTTP SHOULD listen on TCP port 80 and emulate the
> same HTTP version (HTTP 1.1 or HTTP/2 or HTTP/3) and settings chosen
> by the client (emulated web browser). The Server MUST advertise

An H3 server will not listen on TCP port 80. In general, the document needs to be checked for the implicit assumption that HTTP uses TCP; there is text throughout that is nonsensical for H3 (like this example).

Section 6.3. , paragraph 6, comment:
> The average number of successfully established TCP connections per
> second between hosts across the DUT/SUT, or between hosts and the
> DUT/SUT. The TCP connection MUST be initiated via a TCP three-way
> handshake (SYN, SYN/ACK, ACK). Then the TCP session data is sent.
> The TCP session MUST be closed via either a TCP three-way close
> (FIN, FIN/ACK, ACK), or a TCP four-way close (FIN, ACK, FIN, ACK),
> and MUST NOT by RST.

This prohibits TCP fast open, why? Also, wouldn't it be enough to say that the connection needs to not abnormally reset, rather than describing the TCP packet sequences that are acceptable? Given that those are not the only possible sequences, c.f., loss and reordering.

Section 6.3. , paragraph 6, comment:
> The average number of successfully completed transactions per
> second. For a particular transaction to be considered successful,
> all data MUST have been transferred in its entirety. In case of
> HTTP(S) transactions, it MUST have a valid status code (200 OK),
> and the appropriate FIN, FIN/ACK sequence MUST have been
> completed.
H3 doesn't do FIN/ACK, etc. See above.

Section 220.127.116.11. , paragraph 4, comment:
> a. Number of failed application transactions (receiving any HTTP
> response code other than 200 OK) MUST be less than 0.001% (1 out
> of 100,000 transactions) of total attempted transactions.
>
> b. Number of Terminated TCP connections due to unexpected TCP RST
> sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
> connections) of total initiated TCP connections.

Why is a 0.001% failure rate deemed acceptable? (Also elsewhere.)

Section 7.2.1. , paragraph 2, comment:
> Using HTTP traffic, determine the sustainable TCP connection
> establishment rate supported by the DUT/SUT under different
> throughput load conditions.

H3 doesn't do TCP.

Section 18.104.22.168. , paragraph 9, comment:
> The client SHOULD negotiate HTTP and close the connection with FIN
> immediately after completion of one transaction. In each test
> iteration, client MUST send GET request requesting a fixed HTTP
> response object size.

H3 doesn't do TCP FIN.

Section 22.214.171.124. , paragraph 6, comment:
> c. During the sustain phase, traffic SHOULD be forwarded at a
> constant rate (considered as a constant rate if any deviation of
> traffic forwarding rate is less than 5%).

What does this mean? How would traffic NOT be forwarded at a constant rate?

Section 126.96.36.199. , paragraph 5, comment:
> d. Concurrent TCP connections MUST be constant during steady state
> and any deviation of concurrent TCP connections SHOULD be less
> than 10%. This confirms the DUT opens and closes TCP connections
> at approximately the same rate.

What does it mean for a TCP connection to be constant?

Section 7.4.1. , paragraph 4, comment:
> Scenario 1: The client MUST negotiate HTTP and close the connection
> with FIN immediately after completion of a single transaction (GET
> and RESPONSE).

H3 sessions don't send TCP FINs. (Also elsewhere.)

Section 7.7. , paragraph 1, comment:
> 7.7. HTTPS Throughput

Is this HTTPS as in H1, H2 or H3? All of the above?

Found terminology that should be reviewed for inclusivity; see https://www.rfc-editor.org/part2/#inclusive_language for background and more guidance:

* Term "dummy"; alternatives might be "placeholder", "sample", "stand-in", "substitute".

Thanks to Matt Joras for their General Area Review Team (Gen-ART) review (https://mailarchive.ietf.org/arch/msg/gen-art/NUycZt5uKAZejOvCr6tdi_7SvPA).

-------------------------------------------------------------------------------
All comments below are about very minor potential issues that you may choose to address in some way - or ignore - as you see fit. Some were flagged by automated tools (via https://github.com/larseggert/ietf-reviewtool), so there will likely be some false positives. There is no need to let me know what you did with these suggestions.

Section 4.1. , paragraph 8, nit:
> actively inspected by the DUT/SUT. Also "Fail-Open" behavior MUST be disable
> ^^^^
A comma may be missing after the conjunctive/linking adverb "Also".

Section 4.2. , paragraph 9, nit:
> security vendors implement ACL decision making.) The configured ACL MUST NOT
> ^^^^^^^^^^^^^^^
The noun "decision-making" (= the process of deciding something) is spelled with a hyphen.

Section 4.2.1. , paragraph 1, nit:
> the MSS. Delayed ACKs are permitted and the maximum client delayed ACK SHOUL
> ^^^^
Use a comma before "and" if it connects two independent clauses (unless they are closely connected and short).

Section 188.8.131.52. , paragraph 3, nit:
> the MSS. Delayed ACKs are permitted and the maximum server delayed ACK MUST
> ^^^^
Use a comma before "and" if it connects two independent clauses (unless they are closely connected and short).

Section 184.108.40.206. , paragraph 4, nit:
> IPv6 with a ratio identical to the clients distribution ratio. Note: The IAN
> ^^^^^^^
An apostrophe may be missing.

Section 220.127.116.11. , paragraph 2, nit:
> S throughput performance test with smallest object size. 3. Ensure that any
> ^^^^^^^^
A determiner may be missing.

Section 6.1. , paragraph 19, nit:
> sion with a more specific Kbit/s in parenthesis. * Time to First Byte (TTFB)
> ^^^^^^^^^^^^^^
Did you mean "in parentheses"? "parenthesis" is the singular.

Section 7.5.3. , paragraph 2, nit:
> s and key strengths as well as forward looking stronger keys. Specific test
> ^^^^^^^^^^^^^^^
This word is normally spelled with a hyphen.

Section 18.104.22.168. , paragraph 3, nit:
> SHOULD NOT be reported, if the above mentioned KPI (especially inspected thro
> ^^^^^^^^^^^^^^^
The adjective "above-mentioned" is spelled with a hyphen.

Section 7.6.1. , paragraph 4, nit:
> s and key strengths as well as forward looking stronger keys. Specific test
> ^^^^^^^^^^^^^^^
This word is normally spelled with a hyphen.

Section 22.214.171.124. , paragraph 1, nit:
> * Accuracy of DUT/SUT statistics in term of vulnerabilities reporting A.2. T
> ^^^^^^^^^^
Did you mean the commonly used phrase "in terms of"?

Section 7.9.4. , paragraph 2, nit:
> tected attack traffic MUST be dropped and the session SHOULD be reset A.3.2.
> ^^^^
Use a comma before "and" if it connects two independent clauses (unless they are closely connected and short).
Martin Duke No Objection
(126.96.36.199) RFC8446 is not the reference for HTTP/2.

(188.8.131.52), (184.108.40.206) Is there a reason that delayed ack limits are defined only in terms of number of bytes, instead of time? What if an HTTP request (for example) ends, and the delayed ack is very long? Note also that the specification for delayed acks limits it to every two packets, although in the real world many endpoints use much higher thresholds. [It's OK to keep it at 10*MSS if you prefer].

(220.127.116.11) What is a "TCP persistence stack"?
Murray Kucherawy (was Discuss) No Objection
Thanks for handling my DISCUSS about all the normative language used here.

Nits not yet mentioned by others:

Section 4.2:

* "... users SHOULD configure their device ..." -- s/device/devices/ (unless all users share one device)

Section 6.3:

* "The value SHOULD be expressed in millisecond." -- s/millisecond/milliseconds/
Roman Danyliw (was Discuss) No Objection
Thanks for addressing my DISCUSS and COMMENT feedback.
Zaheduzzaman Sarker No Objection
Thanks for the efforts on this specification. I have been part of writing two testcase documents for real-time congestion control algorithms and understand that getting things into a reasonable shape is hard.

I have a similar observation to Murray and Eric when it comes to obsoleting the previous specification, hence I support their DISCUSSes.

Some more comments/questions below -

* Section 5 : what is the "packet loss latency" metric? Where is it defined? How do I measure it?

* A traffic profile is missing in all the benchmark tests, which is a MUST to have. If this is intentional then a rationale needs to be added.

* Section 7.3 and 7.7 : The HTTP throughput will look different not only because of object size but also because of how often the requests are sent. If the requests are sent all at once, the resulting throughput may look like a long file download, and if they are sparse then they will look like small downloads on a sparse timeline. Here, it is not clear to me what the intention is. Again the traffic profile is missing, and I am starting to think that Section 18.104.22.168 might be part of Section 22.214.171.124.

* Section 7.4 and 7.8 : I have a similar view as in my comment on Section 7.3. It is not clear to me that only object size matters here for the latency.
(Benjamin Kaduk; former steering group member) (was Discuss) No Objection
[Updated to remove my Discuss point, as my colleagues have convinced me that my concern was not reasonable]

I support Roman's Discuss (which you have already begun to resolve, thank you).

Perhaps it is time to retire the term "SSL" in favor of the current protocol name, "TLS".

Section 4.1

   In some deployment scenarios, the network security devices (Device
   Under Test/System Under Test) are connected to routers and switches,
   which will reduce the number of entries in MAC or ARP tables of the
   Device Under Test/System Under Test (DUT/SUT). If MAC or ARP tables
   have many entries, this may impact the actual DUT/SUT performance due
   to MAC and ARP/ND (Neighbor Discovery) table lookup processes. This

I understand the motivation for benchmarking the maximum performance from the device under controlled circumstances, but it also seems that if a device really will exhibit degraded performance due to the number of entries in its MAC/ARP table, that would be useful information to have. Perhaps a remark about how future work could include repeating benchmarking results with different numbers of other devices on the local network segment is in order.

Section 4.2

   Table 1 and Table 2 below describe the RECOMMENDED and OPTIONAL sets
   of network security feature list for NGFW and NGIPS respectively.

I agree with the IoTdir reviewer that Certificate Validation should surely be a recommended feature for NGFWs. But see also the DISCUSS point.

   | SSL Inspection | DUT/SUT intercepts and decrypts inbound HTTPS  |
   |                | traffic between servers and clients. Once the  |
   |                | content inspection has been completed, DUT/SUT |
   |                | encrypts the HTTPS traffic with ciphers and    |
   |                | keys used by the clients and servers.          |

This description could stand to be more clear, especially in light of the fundamental differences between TLS 1.2 and TLS 1.3.

First, the description starts off with "intercepts and decrypts" and then goes on to say that once inspection is over, the DUT/SUT "encrypts the HTTPS traffic". Does this mean that the DUT/SUT specifically needs to re-encrypt after decrypting, or is it permissible to retain the original ciphertext and just relay that ciphertext onward?

Second, in TLS 1.3, it is by construction impossible for a single set of traffic encryption keys to be shared by all three of client, server, and DUT/SUT -- RSA key transport is forbidden and ephemeral key exchange is required. In order to perform content inspection, such a middlebox needs to be able to impersonate the server to the client (i.e., holding a certificate and private key that is trusted by the client and represents the identity of the real server, which is expected to require specific configuration on the client to enable) and complete separate TLS connections to client and to server. In this scenario the middlebox must remain as a "machine in the middle" for the duration of the entire connection and decrypt/reencrypt all content using the different keys for the client/middlebox and middlebox/server connections.

   * Geographical location filtering, and Application Identification
     and Control SHOULD be configured to trigger based on a site or
     application from the defined traffic mix.

Do we have a sense for how sensitive the performance results are going to be with respect to the proportion of traffic that triggers these classes of filtering/control? Would it be appropriate to require that this breakdown be included in the report?

Section 126.96.36.199

   validation. Depending on test scenarios and selected HTTP version,
   HTTP header compression MAY be set to enable or disable. This

I didn't think it was possible to fully disable header compression for HTTP/2 and HTTP/3 (just to set the dynamic table size to zero).

   [RFC8446] defines the following cipher suites for use with TLS 1.3.
   [...]

TLS_AES_128_CCM_8_SHA256 is marked as Recommended=N in the registry; I think we should indicate that there is little need to benchmark it except for those special circumstances where the cipher is appropriate. Even TLS_AES_128_CCM_SHA256 (with full-length authentication tag) is mostly only going to be used in IoT environments and is likely not needed for a target of "enterprise grade encryption cipher suites".

Section 188.8.131.52

   server, TLS 1.2 or higher MUST be used with a maximum record size of
   16 KByte and MUST NOT use ticket resumption or session ID reuse. The

Why is TLS resumption prohibited? (As a technical matter, TLS 1.3 resumption uses a different mechanism than the two TLS 1.2 resumption mechanisms, and it may be prudent to specifically note whether TLS 1.3 resumption is also forbidden.)

   server SHALL serve a certificate to the client. The HTTPS server
   MUST check host SNI information with the FQDN if SNI is in use.

What does "check host SNI information with the FQDN" mean? Where is the FQDN in question obtained from? (In §184.108.40.206 we say that the proposed (SNI) FQDN is compared to "the domain embedded in the certificate". Note that, of course, the certificate can contain more than one domain name, e.g., via the now-quite-common use of subjectAltName.)

Section 6.1

   e. Key test parameters

      * Used cipher suites and keys

Do we really need to report the specific *keys* used (as opposed to cryptographic parameters of the TLS connection like the group used for key exchange, algorithm and key size of the server certificate, etc.)?

      * Percentage of encrypted traffic and used cipher suites and
        keys (The RECOMMENDED ciphers and keys are defined in
        Section 220.127.116.11)

For what it's worth, trends in generic web traffic are rapidly converging towards near-universal HTTPS usage. I am not really sure that measuring unencrypted traffic is going to be very interesting to many users (though I concede that some will still be using it and find the corresponding benchmarking results useful).

Section 18.104.22.168

   RECOMMENDED HTTP response object size: 1, 16, 64, 256 KByte, and
   mixed objects defined in Table 4.

With the explosion of video use on the modern Web, it might be worth revisiting these recommended object sizes. Is there likely to be value in having very large objects for any of the tests?

Section 7.6.1, 7.7.1

   Test iterations MUST include common cipher suites and key strengths
   as well as forward looking stronger keys. Specific test iterations

How/where would an implementor obtain more guidance on "common cipher suites and key strengths" and "forward looking stronger keys"? (With understanding that this guidance will change over time and cannot be permanently enshrined in this RFC-to-be.)

(Why does Section 7.8.1 not have similar language?)

Section 9

Hmm, I thought we typically had some language about how if the benchmarking techniques specified in this document were used outside a laboratory isolated test environment, security and other risks could arise (e.g., due to DoS of nearby nodes/services).

Appendix A

I agree with Roman that the text around "CVEs" is imprecise and should be talking about exploits that are identified by CVEs.