Agenda IETF123: maprg: Fri 07:30
agenda-123-maprg-sessb-03
| Meeting Agenda | Measurement and Analysis for Protocols (maprg) RG | |
|---|---|---|
| Date and time | 2025-07-25 07:30 | |
| Title | Agenda IETF123: maprg: Fri 07:30 | |
| State | Active | |
| Last updated | 2025-07-25 | |
IRTF maprg agenda for IETF-123 (Madrid)
Date: Friday, 25 July 2025, Session I 9:30-11:00
Full client with Video: https://meetecho.ietf.org/conference/?group=maprg&short=maprg&item=1
Room: Auditorio
IRTF Note Well: https://irtf.org/policies/irtf-note-well-2019-11.pdf
Agenda
- Overview and Status - Mirja/Dave (5 min)
- Breaking Through the Clouds: Performance Insights into Starlink’s Latency and Packet Loss - Robert Richter (10 mins)
- It's a bird? It's a plane? It's CDN!: Investigating Content Delivery Networks in the LEO Satellite Networks Era - Nitinder Mohan (15 mins)
- Measuring Anycast Performance: Catchment, RTT, and Optimal Site - Remi Hendriks (10 mins)
- ReACKed QUICer: Measuring the Performance of Instant Acknowledgments in QUIC Handshakes - Jonas Mücke (15 mins)
- Examining the Heterogeneous Throughput Performance Landscape of QUIC Implementations - Roland Bless (15 mins)
- QUIC Steps: Evaluating Pacing Strategies in QUIC Implementations - Marcel Kempf (15 mins)
Abstracts
Breaking Through the Clouds: Performance Insights into Starlink’s Latency and Packet Loss
Authors: Robert Richter, Vasilis Ververis, Vaibhav Bajpai
Abstract:
Our modern era is experiencing a rapid evolution in satellite Internet access. However, it is unclear how well these systems perform and what we can expect from Internet access via satellites. Previous research has studied the performance and resilience of such systems, uncovering several drawbacks (e.g., high packet loss and unstable performance). In this work, we thoroughly investigate the characteristics of the Starlink network. We scrutinize the TLS handshake latency, packet loss, and the diurnal latency variation to establish a correlation between these factors. To achieve this, we utilize historical data measured by RIPE Atlas and Cloudflare Radar from 2022-01-01 to 2024-06-30.
We find no statistically significant correlation between latency and packet loss in the Starlink satellite network. However, we discover an intriguing pattern suggesting that Starlink exhibits specific latencies more consistently than others. This finding contradicts recent research that claims a significantly better performance of Starlink with median latencies substantially lower than 80 ms. Furthermore, our findings reveal significant geographical variations, where even highly developed countries such as Germany experience packet loss ratios exceeding 10%.
Additionally, we examined Starlink’s routing behavior, which reveals two sudden spikes in latency. The first spike is attributable to the transition between satellite and terrestrial networks, while the second is seemingly unrelated to Starlink.
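The correlation check described in this abstract can be illustrated with a minimal sketch. It assumes paired per-interval samples of latency and packet-loss ratio and uses a Pearson coefficient with a permutation test; the abstract does not say which statistical test the authors applied, so the method and the function names here are assumptions for illustration only.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def permutation_p_value(xs, ys, rounds=2000, seed=0):
    """Two-sided p-value: how often a random pairing of the samples
    yields a correlation at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (rounds + 1)
```

A large p-value from such a test would be consistent with the "no statistically significant correlation" finding; a small one would argue against it.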
Publication:
It's a bird? It's a plane? It's CDN!: Investigating Content Delivery Networks in the LEO Satellite Networks Era
Authors: Rohan Bose, Saeed Fadaei, Nitinder Mohan, Mohamed Kassem, Nishanth Sastry, Jörg Ott
Abstract:
The dramatic expansion of Low-Earth-Orbit (LEO) satellite constellations is reshaping how end users attach to the Internet, and in doing so it disrupts the locality assumptions that underpin today's Content Delivery Networks (CDNs). A single Starlink terminal may send traffic through a ground station thousands of kilometers away, anchoring its public IP address to an overseas point-of-presence (PoP) and triggering cache mis-selections, inflated latency, and even incorrect geo-blocking. In this talk, we present the first comprehensive, multi-layer study of how Starlink’s routing behaviors impact CDN performance. We integrate (i) controlled experiments via our NetMet browser extension running extensive web-browsing trials over both Starlink and terrestrial links; (ii) crowdsourced insights from over 18 months of M-Lab speed tests and Cloudflare’s Anycast Insight Measurement (AIM) logs; (iii) large-scale active measurements including 3.1 M traceroutes and 4.2 M DNS queries from 98 Starlink-connected RIPE Atlas probes across 25 countries; and (iv) regional deep dives in Africa, using dedicated dishes in Ghana and Zambia, both before and after Starlink’s 2025 launch of two new African PoPs. Our results reveal that PoP distance alone accounts for up to 50 % of page-fetch latency variance and that relocating PoPs from Europe to Africa slashes median fetch times by 60 %. We also observe pervasive cache mis-selections and performance degradations due to geolocation mismatches. Finally, we discuss potential CDN adaptations, such as satellite-aware anycast steering and dynamic cache placement, to restore locality and optimize user experience in the era of LEO networks.
Publication:
- HotNets'24: https://dl.acm.org/doi/10.1145/3696348.3696879
Measuring Anycast Performance: Catchment, RTT, and Optimal Site
Authors: Remi Hendriks
Abstract:
Verfploeter is a measurement technique that allows anycast operators to map the set of addresses that route to each anycast site, by using an active probing technique that utilizes ICMP-responsive hosts on the Internet.
In essence, it sends a ping using an anycast source address and captures the receiving anycast site for the ping-response, thereby mapping the ‘catching’ anycast site for each hitlist target.
We extended this tool to support TCP and UDP probing, as ping catchment mappings might be invalid for anycast deployments that run TCP and UDP services.
Furthermore, we added support for IPv6 to provide tooling for IPv6 anycast research, which is largely unexplored.
Additionally, we introduce novel measurements like measuring actual anycast RTTs experienced by clients, measuring the optimal performance of the anycast deployment, and more.
In this talk we will showcase the large variety of measurements that our tooling can perform (designed to be a Swiss Army knife), including direct examples of how such measurements can be used to assess anycast performance and troubleshoot sub-optimal anycast routing.
We will showcase its effectiveness by showing how anycast routing performs at Internet scale using our anycast testbed, and discuss how it is used by industry partners to improve their production anycast deployments.
With this talk we hope to reach researchers that may have an interest in such tooling and operators that can benefit greatly from using our tooling to improve their anycast services.
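The catchment mapping step described above (probe from an anycast source address, then record which site receives each reply) can be sketched as an offline aggregation. This is not the authors' tool: the record format, the site names, and the function names are hypothetical, and the sketch assumes the probe replies have already been captured at each site.

```python
from collections import defaultdict

def build_catchment(replies):
    """Group responding targets by the anycast site that captured the reply.

    replies: iterable of (site, target_address) tuples, one per captured reply.
    Returns a dict mapping each site to the set of targets it "catches".
    """
    catchment = defaultdict(set)
    for site, target in replies:
        catchment[site].add(target)
    return dict(catchment)

def unstable_targets(catchment):
    """Return targets whose replies arrived at more than one site."""
    seen = defaultdict(set)
    for site, targets in catchment.items():
        for target in targets:
            seen[target].add(site)
    return {t: sites for t, sites in seen.items() if len(sites) > 1}

# Hypothetical captured replies (documentation address ranges only).
replies = [
    ("ams", "192.0.2.1"),
    ("nyc", "198.51.100.7"),
    ("ams", "192.0.2.9"),
    ("nyc", "192.0.2.9"),  # same target seen at a second site
]
catchment = build_catchment(replies)
flapping = unstable_targets(catchment)
```

Targets that appear at several sites are kept separate here because a single-site mapping would hide routing changes during the measurement.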
Publication:
- So far unpublished
ReACKed QUICer: Measuring the Performance of Instant Acknowledgments in QUIC Handshakes
Authors: Jonas Mücke, Marcin Nawrocki, Raphael Hiesgen, Thomas C. Schmidt, Matthias Wählisch
Abstract:
In this paper, we present a detailed performance analysis of QUIC
instant ACK, a standard-compliant approach to reduce waiting times
during the QUIC connection setup in common CDN deployments. To
understand the root causes of the performance properties, we combine
numerical analysis and the emulation of eight QUIC implementations using
the QUIC Interop Runner. Our experiments comprehensively cover packet
loss and non-loss scenarios, different round trip times, and TLS
certificate sizes. To clarify instant ACK deployments in the wild, we
conduct active measurements of 1M popular domain names. For almost all
domain names under control of Cloudflare, Cloudflare uses instant ACK,
which in fact improves performance. We also find, however, that instant
ACK may lead to unnecessary retransmissions or longer waiting times
under some network conditions, raising awareness of drawbacks of instant
ACK in the future.
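One way to see why an instant ACK can shorten waiting times is the probe timeout (PTO) formula from RFC 9002, where the first RTT sample sets `smoothed_rtt` to the sample and `rttvar` to half of it. The sketch below is a simplified model and not the paper's numerical analysis: it assumes the server needs some extra time before it can send its handshake data (the `server_delay_ms` parameter is hypothetical), and that without an instant ACK this delay inflates the client's first RTT sample.

```python
K_GRANULARITY_MS = 1.0  # kGranularity as recommended in RFC 9002

def pto_after_first_sample(rtt_sample_ms, max_ack_delay_ms=0.0):
    """PTO right after the first RTT sample (RFC 9002, Sections 5.3 and 6.2.1).

    max_ack_delay is 0 for the Initial and Handshake packet number spaces.
    """
    smoothed_rtt = rtt_sample_ms
    rttvar = rtt_sample_ms / 2
    return smoothed_rtt + max(4 * rttvar, K_GRANULARITY_MS) + max_ack_delay_ms

def client_pto_ms(path_rtt_ms, server_delay_ms, instant_ack):
    """Client PTO under the simplified model described in the lead-in."""
    if instant_ack:
        sample = path_rtt_ms                    # ACK sent immediately
    else:
        sample = path_rtt_ms + server_delay_ms  # ACK waits for handshake data
    return pto_after_first_sample(sample)
```

In this model the PTO is three times the first RTT sample, so a smaller sample means faster loss recovery. It also hints at the drawback named in the abstract: if the server delay exceeds the shortened PTO, the client's probe timer can fire before the server's handshake data arrives.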
Publication:
Examining the Heterogeneous Throughput Performance Landscape of QUIC Implementations
Authors: Michael König, Sebastian Rust, Martina Zitterbart, Björn Scheuermann
Abstract:
QUIC, a UDP-based transport protocol that integrates TLS for security and reduces connection latency, has gained
widespread adoption and is now underpinning a substantial share of data traffic for major platforms like Cloudflare, Google, and
Facebook. Given its growing deployment across major Internet platforms, there is growing attention on the performance potential
of QUIC implementations. This paper provides an in-depth study of different QUIC implementations on a hardware testbed with
10 Gbit/s links. Our focus is on the achievable goodput in different scenarios and with different implementations. In contrast to other
performance studies of QUIC, we investigated QUIC together with multiple versions of HTTP and used multiple streams for the
data transfer. Our results show that merely choosing a different application protocol (i.e., HTTP/3 versus HTTP/0.9) can reduce
goodput by as much as 27 %. Dedicated traffic generators can further significantly boost achievable goodput, in some cases more
than doubling the throughput obtained via HTTP. Moreover, our analysis reveals that increasing the number of QUIC streams
may potentially double the throughput of multi-segment data transfers, depending on the implementation. Additionally, certain
QUIC implementations can saturate a 10 Gbit/s link by increasing packet sizes, indicating that QUIC packet processing speed,
rather than raw transmission capacity, is a primary bottleneck.
These findings highlight QUIC’s capabilities, limitations, and implementation heterogeneity. The differences between QUIC
and QUIC+HTTP throughput emphasize the need for dedicated performance tests. Understanding these distinctions is crucial for
analyzing, optimizing, and maximizing QUIC’s performance.
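The observation that larger packets help saturate the link can be made concrete with a back-of-the-envelope calculation: at a fixed bit rate, the number of packets an endpoint must process per second falls in proportion to packet size. The sketch below ignores all header overhead, and the packet sizes are illustrative values rather than figures from the paper.

```python
def packets_per_second(link_rate_bps, packet_bytes):
    """Packets per second needed to fill a link, ignoring header overhead."""
    return link_rate_bps / (8 * packet_bytes)

LINK_RATE_BPS = 10e9  # 10 Gbit/s, as in the testbed described above

for size in (1250, 2500, 5000):  # illustrative packet sizes in bytes
    rate = packets_per_second(LINK_RATE_BPS, size)
    print(f"{size} B packets: {rate:,.0f} packets/s")
```

If per-packet processing cost is roughly constant, quadrupling the packet size cuts the required processing rate to a quarter, which fits the abstract's conclusion that packet processing speed rather than link capacity is a primary bottleneck.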
Publication:
- IFIP Networking 2025 - author's copy: https://doc.tm.kit.edu/2025-Examining-the-Heterogeneous-Throughput-Performance-Landscape-of-QUIC-Implementations-Koenig-et-al.pdf
QUIC Steps: Evaluating Pacing Strategies in QUIC Implementations
Authors: Marcel Kempf, Simon Tietz, Benedikt Jaeger, Johannes Späth, Georg Carle, Johannes Zirngibl
Abstract:
Pacing is a key mechanism in modern transport protocols, used to
regulate packet transmission timing to minimize traffic burstiness,
lower latency, and reduce packet loss. Standardized in 2021, QUIC is a
UDP-based protocol designed to improve upon the TCP/TLS stack. While the
QUIC protocol recommends pacing, and congestion control algorithms like
BBR rely on it, the user-space nature of QUIC introduces unique
challenges. These challenges include coarse-grained timers, system call
overhead, and OS scheduling delays, all of which complicate precise
packet pacing.
This paper investigates how pacing is implemented differently across
QUIC stacks, including quiche, picoquic, and ngtcp2, and evaluates the
impact of system-level features like GSO and Linux qdiscs on pacing.
Using a custom measurement framework and a passive optical fiber tap, we
establish a baseline with default settings and systematically explore
the effects of qdiscs, hardware offloading using the ETF qdisc, and GSO
on pacing precision and network performance. We also extend and evaluate
a kernel patch to enable pacing of individual packets within GSO
buffers, combining batching efficiency with precise pacing.
Kernel-assisted and purely user-space pacing approaches are compared. We
show that pacing with only user-space timers can work well, as
demonstrated by picoquic with BBR. With quiche, we identify FQ as a
qdisc well-suited for pacing QUIC traffic, as it is relatively easy to
use and offers precise pacing based on packet timestamps. We uncovered
that internal mechanisms, such as a library’s spurious loss detection
logic or algorithms such as HyStart++, can interfere with pacing and
cause issues like unstable congestion windows and increased packet loss.
Our findings provide new insights into the trade-offs involved in
implementing pacing in QUIC and highlight potential optimizations for
real-world applications like video streaming and video calls.
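The trade-off between batching and pacing precision discussed in this abstract can be illustrated with a small timing model. It is a sketch under simplifying assumptions: packets are equally sized, the pacing rate is constant, and a batch of packets handed over together leaves as one back-to-back burst. The function and parameter names are hypothetical and do not come from any of the QUIC stacks named above.

```python
def departure_times_us(num_packets, packet_bytes, rate_bytes_per_s, batch=1):
    """Departure time of each packet in microseconds.

    batch=1 models per-packet pacing; batch>1 models buffers of `batch`
    packets released together, with the gap between buffers stretched
    so that the average rate stays the same.
    """
    interval_us = packet_bytes * 1_000_000 / rate_bytes_per_s
    return [(i // batch) * batch * interval_us for i in range(num_packets)]

# 1250-byte packets at 1.25 MB/s (10 Mbit/s): one packet per millisecond.
paced = departure_times_us(4, 1250, 1_250_000)
bursty = departure_times_us(4, 1250, 1_250_000, batch=2)
```

With `batch=2` the average rate is unchanged, but packets leave in pairs with a doubled gap between pairs, which is the burstiness that pacing individual packets within a GSO buffer is meant to avoid.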