March 23, 2022
Chairs: Mirja Kühlewind, Dave Plonka
Minutes: Alex Mayrhofer
Hammas Bin Tanveer (20 mins)
Scanning in IPv4 is well understood - but not in IPv6 (larger address space, identification of active space required - research focuses on strategies for that).
Scanning - "sending unsolicited communication"
v4 scans of the whole address space complete in < 5 mins.
v6: Brute force is not feasible due to address space size
Two strategies - IP scanning & NXDOMAIN scanning.
IP scanning reduces the search space (allocations, zone files, etc.)
NXDOMAIN scanning: reduces the search space using RFC 8020 semantics by walking ip6.arpa (an NXDOMAIN prunes the whole subtree; NOERROR at an empty non-terminal means names exist below) - provides an efficient side channel for scanning.
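A minimal sketch of such a walk (assuming the dnspython library; the example prefix is the IPv6 documentation prefix, not one from the study): an NXDOMAIN answer prunes an entire nibble subtree, so only NOERROR branches are descended.

    # Sketch of an RFC 8020-based ip6.arpa walk; assumes dnspython, illustrative only.
    import dns.resolver

    HEX = "0123456789abcdef"

    def walk(name, depth, found):
        """Descend nibble labels under ip6.arpa, pruning NXDOMAIN subtrees."""
        for nibble in HEX:
            child = f"{nibble}.{name}"
            try:
                answer = dns.resolver.resolve(child, "PTR")
            except dns.resolver.NXDOMAIN:
                continue                  # RFC 8020: nothing exists below this label
            except dns.resolver.NoAnswer:
                answer = None             # empty non-terminal: names exist further down
            if depth == 1 and answer is not None:
                found.append(child)       # full 32-nibble reverse name with PTR data
            elif depth > 1:
                walk(child, depth - 1, found)
        return found

    # Walk the 18 remaining nibbles of a /56 under the documentation prefix 2001:db8::/32:
    # walk("0.0.0.0.0.0.8.b.d.0.1.0.0.2.ip6.arpa.", 18, [])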
Peter Thomassen: Works only when reverse DNS delegations exist - fraction of active reverse DNS?
Hammas: There is previous work
Mimicked an active IPv6 address space, captured scanning traffic (both DNS and IP); used a previously unannounced /56.
"Lower byte" addresses as well as "random" addresses used for services.
Scanning increased in both traffic volume and number of probes after deployment.
Conclusion: NXDOMAIN scanners are very efficient, while IP scanners are constantly hitting the range.
IP-based scanners target both active and control subnets; NXDOMAIN scanners are much more efficient.
Most scanners are not (yet) using NXDOMAIN responses.
Address discovery methods are different than expected.
Neighbor scanning happens.
Alex Mayrhofer: Any observations on how many nameservers support RFC8020?
A: The nameserver we used did; no figures.
Ralf Weber: A defense mechanism would be to put in "nothing", or "everything"
A: Yes, one of the strategies recommended in the paper
Peter Koch: There is previous work in a paper; RFC 8020 was only a clarification. Discussion will be taken offline.
Petr Spacek: A solution would be adding a DNS wildcard. Please don't say RFC 8020 is wrong.
Moritz Müller (10 mins)
The work was initiated by ICANN, in cooperation with NLnet Labs. The goal is to recommend a matrix to ICANN of what should be measured. The talk shares some of the challenges and welcomes feedback from the measurement community.
Challenge 1: Two-sided deployment - signing and validation. Each side has many metrics (signing: algorithms, key rollover policies, NSEC*; validation: algorithms, trust anchors, signaling protocols, …).
DNSSEC-related protocols are still being extended / worked on (e.g. CDS / CDNSKEY) - should those also be measured? DANE?
Challenge 2: Measurement technique - active vs. passive, end user vs. resolver vs. authoritative server, raw vs. aggregated data…
Plan: study existing research, perform a gap analysis, build an assessment framework to assess measurement techniques.
Assessment aspects: coverage, reproducibility, feasibility
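As a toy example of one cell in such a matrix - an active, resolver-side validation check (not from the talk; assumes dnspython, and the test names are placeholders for a correctly signed and a deliberately broken zone):

    # Toy validation-side check; assumes dnspython, placeholder test names.
    import dns.message, dns.query, dns.flags, dns.rcode

    def validates(resolver_ip,
                  good="signed-ok.example",      # placeholder: correctly signed name
                  bad="signed-broken.example"):  # placeholder: deliberately broken name
        def ask(name):
            q = dns.message.make_query(name, "A", want_dnssec=True)
            return dns.query.udp(q, resolver_ip, timeout=3)
        good_resp, bad_resp = ask(good), ask(bad)
        # A validating resolver sets AD on the good name and SERVFAILs the broken one.
        return bool(good_resp.flags & dns.flags.AD) and bad_resp.rcode() == dns.rcode.SERVFAIL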
Alex Mayrhofer: #1 metric would be the global percentage of DNSSEC-protected transactions on the public internet
Jerome Mao (20 mins)
Considered Angles for the paper:
Approach: Compel resolvers to engage with the authors' authoritative DNS, which forces TCP fallback; check for the TCP follow-up (sketch below).
Data sources: open resolvers, email bouncing, RIPE Atlas, logs from CDN operators.
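A rough sketch of the per-server check behind such numbers (assuming dnspython; server IP and query name are placeholders): send the same query over UDP and TCP and record whether the UDP answer is truncated and whether TCP works at all.

    # Rough per-server DNS-over-TCP check; assumes dnspython, placeholder target.
    import dns.message, dns.query, dns.flags, dns.exception

    def check_tcp(server_ip, qname="www.example.com"):
        q = dns.message.make_query(qname, "A")
        udp_resp = dns.query.udp(q, server_ip, timeout=3)
        udp_truncated = bool(udp_resp.flags & dns.flags.TC)  # TC forces the client to retry over TCP
        try:
            dns.query.tcp(q, server_ip, timeout=3)           # RFC 7766: TCP support is mandatory
            tcp_works = True
        except (OSError, EOFError, dns.exception.Timeout):
            tcp_works = False
        return udp_truncated, tcp_works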
TCP queries are hard to relate to UDP queries due to resolver infrastructure (ingress / egress).
~19% of queries come from different IP addresses
~95% of resolvers are TCP-fallback capable.
3-5% of domains fail to resolve via TCP
11 out of 47 CDNs fail on DNS over TCP
13.5% do re-use TCP connections
33% of popular websites and 4 CDN providers close the connection immediately after the response -> failure
Mirja Kühlewind: Did you reach out to CDN operators?
A: Yes, we did.
Ralf Weber: "Some servers were not reachable" - what does that mean?
A: It means a server in the resolution chain failed, not the whole resolution.
Ralf: Any numbers on "complete failure" - domains that would not resolve, rather than servers? Domains where at least one server responds are still okay.
A: Will take this offline
Peter Thomassen: 1) How to assess the "significance" of a resolver - bias? 2) Common view that in enterprise networks DNS over TCP might be blocked - any insights on that?
A: We know from the CDN logs to what extent a resolver is used.
Mirja: Take the second part to the list - might need further study.
Mike Kosek (20 mins)
History of DNS transport - unencrypted UDP with fallback to TCP. Move to encrypted DNS - suffers from TCP problems. QUIC, e.g., removes head-of-line blocking.
AdGuard / NextDNS offer public DNS-over-QUIC servers.
Scanned IPv4 address space for 29 weeks from TU Munich - DoQ as baseline, collected DoQ and QUIC version stats.
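A rough reachability probe in the same spirit (not the authors' tooling; assumes the aioquic package, the "doq" ALPN on UDP/853, and a placeholder target address):

    # Rough DoQ reachability probe; assumes aioquic, illustrative only.
    import asyncio, ssl
    from aioquic.asyncio import connect
    from aioquic.quic.configuration import QuicConfiguration

    async def doq_handshake(host, port=853):
        """True if a QUIC handshake with ALPN "doq" completes against host:port."""
        cfg = QuicConfiguration(is_client=True, alpn_protocols=["doq"])
        cfg.verify_mode = ssl.CERT_NONE          # scan setting: accept any certificate
        try:
            async with connect(host, port, configuration=cfg):
                return True                      # connect() only yields once the handshake is done
        except Exception:
            return False

    # asyncio.run(doq_handshake("203.0.113.1"))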
found 1217 resolvers (from ~830 up during the 29 weeks). High fluctuation. 7 different DoQ / QUIC version pairs.
The major change in the last weeks of 2021 was a move of the AdGuard Home software to a new version.
Response times - location bias because of a single vantage point; did comparative measurements against other transports. Did not use TLS resumption or 0-RTT.
found no protocol specific path influences.
DoQ is slower than DoTCP, but faster than DoT/DoH
Only 20% of implementations use the full response time potential of QUIC. But DoQ is already the best choice for encrypted DNS.
Paper and Code/Data available via slide deck.
Lorenzo Colitti: Rolling out DoH3 (which was not tested) - seemed easier to implement (more server support). Takes a hit on bandwidth due to header size, though; a measurement of that would be interesting. DoQ on Android would sum up to a few thousand queries, so it matters. Looking at the QUIC bug mentioned in the paper.
Joerg Deutschmann (20 mins)
Satellite networks rely on "performance enhancing proxies", which are not applicable to encrypted transports (such as QUIC); this hurts performance.
Tests were only performed with one implementation.
QUIC interop runner: https://interop.seemann.io/ - includes 2 performance tests. Created Sat-related performance tests (higher RTTs) https://interop.cs7.tf.fau.de/
Also used real sat links for testing (Astra & Eutelsat) - only a single vantage point, though. Allows comparison of simulation vs. real sat link.
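Not from the talk, but to illustrate the kind of link emulation involved (assumes Linux tc/netem, root privileges, and a placeholder interface name) - a GEO-like RTT can be approximated by adding a few hundred milliseconds of one-way delay:

    # Illustrative only: emulate a high-RTT (GEO-like) link with Linux netem.
    import subprocess

    IFACE = "eth0"            # placeholder interface
    ONE_WAY_DELAY = "300ms"   # ~600 ms RTT, roughly a GEO satellite hop

    subprocess.run(
        ["tc", "qdisc", "add", "dev", IFACE, "root", "netem", "delay", ONE_WAY_DELAY],
        check=True)           # requires root; undo with: tc qdisc del dev eth0 root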
Very mixed results for sat…
nginx and quant server yield many failures.
Link utilization is not good in any case - influenced by congestion control algorithms.
Completion graph: slow start takes very long with many implementations.
General: Very poor performance, hard to debug each combination in detail. More detailed analysis planned, eg. long term measurements.
Hammas Tanveer: Can this be transferred to low earth orbit, especially with satellites changing? Handover effects?
A: Have some results with Starlink, which performs quite well, we don't know about handover effects yet.
Lorenzo Colitti: Latency-dependent aggressiveness? Stupid or good idea? Sure you will share data with developers… out of curiosity…
A: Yes, there are "if-then" statements in code, and it does help indeed. Other approaches are discussed in the QUIC WG. Parameter tuning vs. congestion control - all basic questions. It helps if implementations add a satellite test case; hope to motivate them to include this.
Matthias Waehlisch (20 mins)
Was presented at last year's IMC
Is QUIC used for DoS? Yes!
Point: During the first RTT, the server responds to an unverified source; the first interaction allows for reflection/amplification attacks. Rather unlikely because QUIC allows the server to send only 3 times the data it received - limits the amplification factor to 3 (limited by design).
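Back-of-the-envelope illustration of that limit (packet sizes are the protocol minimums, not figures from the talk):

    # Anti-amplification limit: before address validation a QUIC server may send
    # at most 3x the bytes it received (RFC 9000, Section 8).
    client_initial = 1200               # minimum size of a client Initial datagram, in bytes
    max_unvalidated_reply = 3 * client_initial
    print(max_unvalidated_reply)        # 3600 bytes -> amplification factor capped at 3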
Resource exhaustion - allocating state on the server by spoofing connection requests. Network telescopes can observe these attempts; leveraged in the study.
Used the UCSD telescope (/9 - ~2% of IPv4 address space!) during April 2021; it receives lots of malicious traffic.
Used Wireshark - filtered for UDP/443 traffic, then dissection to filter out non-QUIC traffic. Detected 92M QUIC packets, split into requests and responses.
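A rough equivalent of that filtering step (not the authors' pipeline; assumes tshark is installed and the file names are placeholders):

    # Rough equivalent of the filtering step; assumes tshark, placeholder file names.
    import subprocess

    subprocess.run(
        ["tshark", "-r", "telescope.pcap",    # raw telescope capture
         "-Y", "udp.port == 443 && quic",     # keep UDP/443 packets that dissect as QUIC
         "-w", "quic_only.pcap"],
        check=True)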
Was sanitizing necessary? Yes! Scanners / research dominate!
The vast majority of responses received are from content provider networks.
Inferring DoS attacks by qualifying flows by session time, packet count, and packet rate.
Found 2905 qualifying flows - 58% to Google, 25% to Facebook.
Trend for destinations remains even when extending identification limits.
Conclusion: Attackers leverage even new protocols
Mitigation: QUIC RETRY (similar to TCP cookies)
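A toy illustration of the Retry idea (not tied to any particular QUIC stack): the server derives a token from the claimed source address with a keyed MAC, sends it back without allocating state, and only sets up connection state once a client echoes a valid token.

    # Toy illustration of Retry-style address validation; not a real QUIC implementation.
    import hmac, hashlib, secrets

    KEY = secrets.token_bytes(32)        # server-side secret; rotated regularly in practice

    def make_retry_token(src_addr):
        """Stateless token bound to the claimed source address."""
        return hmac.new(KEY, src_addr.encode(), hashlib.sha256).digest()

    def validate_token(src_addr, token):
        """Only after this check would the server allocate per-connection state."""
        return hmac.compare_digest(make_retry_token(src_addr), token)

    token = make_retry_token("198.51.100.7")     # spoofed Initials never see this token
    assert validate_token("198.51.100.7", token)
    assert not validate_token("203.0.113.9", token)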
Do floods "work"? Yes, tests with nginx show that from 100 pps onwards, service is degraded.
Enabling QUIC Retry prevents the server from being exhausted (in that test).
Retry is not seen in the backscatter. Update 2022 - two cases showed Retry mitigation.
Paper / artifacts available via Slides
Alex: The packet count seems small - was there a notable increase in terms of pps? Doesn't look like "serious" attacks.
A: Will need to check whether pps grew between 2021 and 2022.
Hammas: Some scanners use more than one address, did that appear as "attack"?
A: see slide on attack classification