Minutes IETF-101: maprg
Measurement and Analysis for Protocols Research Group (maprg) Agenda @ IETF-101
Date: Tuesday, March 20, 9:30-12:00 - Tuesday Morning session I
Scribed by: Mat Ford
Abstracts of all talks are included at the foot of these notes.
## Intro & Overview
Dave Plonka (DP) and Mirja Kühlewind (MK)
## Heads-up talk: Challenges in measuring 1 Gbps access speeds
## Heads-up talk: Zesplot in five minutes - An attempt to visualise IPv6 address space
Luuk Hendriks
DP: Interesting transitioning from tool to useful measurement (anycast)
## Update on previous presentation: Measuring the quality of DNSSEC deployment
Roland van Rijswijk-Deij
## Update on previous presentation: Update on IPv6 Performance Data
Tommy Pauly (TP)
Tim Chown: Where are UK cellular measurements coming from?
TP: This is just sampling from Apple devices, iPhones etc., other devices may
have been configured differently. This is the view that we have of UK carriers.
TC: I believe some UK operator has about a million handsets on IPv6, but maybe
not Apple devices.
MK: Do you have absolute numbers?
TP: For this specific measurement, it's order of 1000 connections, so a very
small sample.
Lorenzo Colitti: Based on your charts I get the impression that across the
board you end up using IPv6 for half of the total connections.
TP: Based on happy-eyeballs data that would be mainly from hosts that are not
LC: Say you had a large head start for IPv6, how would the numbers change?
TP: Looking at our happy-eyeballs data, when IPv6 is available in the network
and you have a dual-stack host, we use IPv6 95% of the time.
LC: Maybe I'm just misreading graph. Why is WiFi used only 14% of time if there
is 39% IPv6 on US WiFi?
TP: I imagine it's because a lot of our connections are to things that don't
have dual-stacked service.
LC: So the 15% difference is the content factor.
Michael Tuexen: What is the RTT of a TCP connection?
TP: This is smoothed average. At the end of the connection we look at smoothed
average of what TCP saw during lifetime of connection.
Geoff Huston: You're not measuring IPv6 and IPv4 to the same endpoint. You're
measuring IPv6 when happy eyeballs says use IPv6, IPv4 when happy eyeballs says
use IPv4 and IPv4 when there is no IPv6 to use, right?
TP: Yes, lots of biases in here.
GH: I get a very different answer when I look at one endpoint and IPv6 and IPv4
to the same endpoint. If you changed happy eyeballs timers would you change
TP: We're including all IPv4 connections, some of which are not eligible for
happy eyeballs. It's just trying to get an overall picture.
GH: You're comparing oranges and mandarins!
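[Scribe's note: the racing behaviour debated in this exchange can be sketched as follows. This is an illustrative simplification, not Apple's implementation; the function name is made up, and the only parameter taken from a spec is the 250 ms Connection Attempt Delay default from RFC 8305 (Happy Eyeballs v2).]

```python
# Sketch of a Happy Eyeballs race: IPv6 gets a head start, and IPv4 only
# "wins" (and hence gets measured) if it completes before IPv6 does even
# after that head start. Illustrative only; times are in milliseconds.

def happy_eyeballs_winner(v6_connect_ms, v4_connect_ms, head_start_ms=250.0):
    """Return the address family that wins a simulated connection race.

    v6_connect_ms / v4_connect_ms: simulated connect completion times
    (None means no address of that family is available).
    head_start_ms: delay before the IPv4 attempt starts (RFC 8305's
    default Connection Attempt Delay is 250 ms).
    """
    if v6_connect_ms is None:   # no IPv6 at all: IPv4 wins by default
        return "ipv4"
    if v4_connect_ms is None:
        return "ipv6"
    # The IPv4 attempt only starts once the head start elapses.
    return "ipv6" if v6_connect_ms <= head_start_ms + v4_connect_ms else "ipv4"
```

This makes GH's point concrete: an IPv6 connect of 300 ms loses to a 20 ms IPv4 connect even with the head start, so per-family RTT samples are conditioned on which family won the race.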
## Update on client adoption for both TLS SNI and IPv6
Erik Nygren (EN)
## On the use of TCP's Initial Congestion Window in IPv4 and by Content Delivery Networks
Jan Rüth (JR)
Bob Briscoe: There was a presentation at a recent IETF that measured ECN but
also incidentally measured IW10 - got similar results - didn't go into as much
detail - that study showed some measurement results for mobile networks - I can
post URL on the mailing list.
Stephen Strowes: I'm looking forward to seeing your IPv6 results! CDNs do
pretty aggressive MSS clamping - interested to see how that affects your
results.
JR: We were wondering how CDNs were delivering their initial windows - some of
them seem to pace their data, not all.
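[Scribe's note: the inference step behind such IW scans can be sketched roughly as below. This is a minimal illustration of the idea (withhold ACKs after requesting a large object, count the bytes in the server's first flight); the function and parameter names are made up, and this is not the paper's actual tooling.]

```python
import math

def infer_initial_window(first_flight_bytes, mss):
    """Estimate a server's TCP initial congestion window in segments.

    first_flight_bytes: payload bytes observed before the client sends
    any ACK (the server's first flight). mss: negotiated maximum segment
    size. The IW is the number of MSS-sized segments needed to carry
    that flight.
    """
    return math.ceil(first_flight_bytes / mss)
```

For example, a 14600-byte first flight with a 1460-byte MSS implies IW10 (the RFC 6928 recommendation), while 4380 bytes implies the older IW3 of RFC 3390.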
## A First Look at QUIC in the Wild
Jan Rüth (JR)
DP: This is an upcoming publication next week at PAM so you're some of the
first to see these results.
Dmitri Tikhonov: LiteSpeed is spelled L I T E.
MK: There is a question on Jabber about the availability of data: are your own
measurements available or shareable?
JR: Some measurement data is available - IXP and ISP traces are available -
MAWI data is available from their website, we can't publish TLD data because
domain lists are under NDA, https://quic.netray.io - IPv4 weekly scans run on
Friday, data available on Saturdays.
## Adoption, Human Perception, and Performance of HTTP/2 Server Push
Torsten Zimmermann (TZ)
DP: How do you translate from an Alexa1M domain name to a CDN to determine who
is hosting content?
TZ: Started with IPs, mapped to AS numbers, also doing regular DNS
measurements, and analyse DNS chain CNAMES and identifiers - this could be an
DP: Hint - using passive DNS can tell you the collection of FQDNs below a name
- there are many Alexa services that use many CDNs simultaneously.
Erik Nygren: Especially as more sites move to SNI the zmap IPv4 scan is going
to become decreasingly useful because many servers will behave differently
depending on which SNI they see. IPv6 deployment growth is also a
consideration. Alexa very much focuses on www style sites but increasing amount
of content is on separate domains for images, videos etc. that don't show up as
well on that list. When looking at push performance behaviour - you show that
there are different providers providing push - is that common across the board
or did some providers have behaviours that were generally negative while others
were more generally positive?
TZ: We couldn't attribute that to the provider itself; it's more down to the
configuration of the website, i.e., the user of the website. Push requires
manual configuration, but we also see some sites that use plugins, like
WordPress plugins, that scan the file tree and observe static files to do that
configuration.
EN: There are starting to be product features from various vendors that analyse
site behaviour and then modify push behaviour. You may see cases where that
kind of closed-loop analysis of what to push has very different behaviours than
more static configurations.
Ian Swett: Thanks for publishing this work - I've long been looking for data on
the benefits or otherwise of push - it's hard to get good data and this is the
best I've seen - it makes clear the challenges of making push work in the wild.
Have you considered evaluating cache digests and whether that makes things much
better?
TZ: We will look into that. There will be a talk at the httpbis meeting today
on cache digests - there's still a lot that can go wrong with server push -
it's a cool feature but maybe standardised too early - without cache digests it
shouldn't be used.
## Inferring BGP Blackholing Activity in the Internet
Georgios Smaragdakis (GS)
Alexander Azimov: Unfortunately blackholing is a service - it's not only DDoS
mitigation - it's also a service for censorship - especially in Russia it is
used to block some resources and it is quite popular.
GS: I didn't want to say that, you said that, fine. Some of the long-lived ones
are candidates for censorship - we have enough indications for that but we
don't have ground truth. If you see days of blackholing, and we also use other
analysis, we can find that these are websites with political content.
AA: It looks like blackholing plus hijacks.
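[Scribe's note: for context, blackholing is typically signalled with BGP communities. RFC 7999 reserves 65535:666 as the well-known BLACKHOLE community, and many providers use an ASN:666 convention of their own. A rough sketch of how announcements might be flagged follows; the heuristic and names are illustrative, not the presented methodology.]

```python
WELL_KNOWN_BLACKHOLE = "65535:666"   # RFC 7999 BLACKHOLE community

def looks_blackholed(communities):
    """Heuristically flag a BGP announcement as blackholed.

    communities: iterable of "ASN:value" community strings. Flags the
    RFC 7999 well-known community as well as the common (but not
    universal) provider convention of using 666 as the community value.
    """
    for c in communities:
        if c == WELL_KNOWN_BLACKHOLE:
            return True
        _, _, value = c.partition(":")
        if value == "666":           # provider-specific ASN:666 convention
            return True
    return False
```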
## Deploying MDA Traceroute on RIPE Atlas Probes
Kevin Vermeulen (KV)
DP: Interesting to contrast what you've found with this breadth of paths versus
what we usually call the diameter of Internet. We usually see a maximum length.
Is this terminology with diamonds something new or is there some existing
literature about this?
KV: There is a paper from Transactions on Networking that conducted a survey 7
years ago; it defined length and width. We have added symmetry and meshing
because these metrics are important for our heuristics.
Kyle Rose: Could these techniques be used by an attacker to discover weak
points in a network? A wide node could mean something for example. Could you
identify a specific load-balanced node and try to take it down because you've
identified it as a bottle-neck for example.
KV: I'm not into security, but maybe, I suppose so - the goal here was to make
a nice map, so by merging all the traceroutes that we get, indeed we see the
core of the network, which may be useful to an attacker.
Tim Chown: I regularly use a package called perfSONAR to measure loss, latency
and throughput between a large number of sites. Its trace task has no idea,
when something changes, where the change occurred. It's nice to have an
algorithm that helps distinguish between the endpoints, endpoint networks and
the network.
KV: The heuristics don't take into account where the load-balancing is taking
place but they could - it's a question that we've considered.
TC: Will follow up by email.
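[Scribe's note: the MDA's statistical guarantee can be illustrated with a simplified per-hop stopping rule: having discovered k next hops at an interface, keep probing until the chance of having missed a hypothetical (k+1)-th equally likely hop falls below a failure bound. This is a sketch of the idea only; the published MDA tables bound the overall failure probability and give somewhat different probe counts.]

```python
import math

def probes_needed(k, epsilon=0.05):
    """Probes required to rule out a (k+1)-th equally likely next hop.

    If k+1 next hops each receive a probe with probability 1/(k+1), the
    chance that n probes all land on the k already-known hops is
    (k/(k+1))**n. Return the smallest n making that chance < epsilon.
    """
    return math.ceil(math.log(epsilon) / math.log(k / (k + 1)))
```

The count grows quickly with the width of the diamond, which is why a full MDA trace can need thousands of packets and is too costly on a RIPE Atlas probe.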
## An endhost-centric approach to detect network performance problems
Olivier Tilmans (OT)
Tim Chown: Is code available?
OT: If you want it, I can give it to you, but it's not yet published - I am
open to deploying it elsewhere and comparing results. Sharing the dataset has
privacy implications for the students.
Lorenzo Colitti: Are you using cgroup eBPF filters?
OT: I am putting a tracing mechanism in the kernel attached to
LC: So you see socket calls not packets?
OT: I'm intercepting kprobes -
LC: What does this give you for QUIC? It shouldn't give you anything.
OT: You can do this in user space as well.
LC: You have to do something like LD_PRELOAD or whatever to attach
OT: Not even actually - we can set up the kernel such that any userspace or
kernel space call matching some symbol will have
LC: So your implementation only works for a given version of the QUIC binary?
OT: Yes, that is a challenge we have. We know that students, for example, have
Chromium - if we want to move to another implementation we have to make it
compatible - we need to define a better/more generic way.
LC: It could break between chromium version 65 and 66.
OT: Yes, but we own the machines so we own the version - we control everything.
LC: Alright, thank you.
Neal Cardwell: Wanted to offer a conjecture on some of the mysterious delay
spikes you're seeing. Some of the TCP RTTs you're seeing are suspiciously
similar to delayed ack values on common operating systems (200ms and 40ms). So
something to look at.
Theresa Enghardt: Where can I find your code?
OT: I can share privately but it's not published yet. If you want to
collaborate drop me an email.
## Closing remarks
DP: You may have noticed a difference between the last two talks and the
remainder of the agenda. This was deliberate to carve out space to have talks
about tooling and measurement strategies. In maprg we want to bring insight to
engineering and operation of protocols. We suggest that you can only come and
talk about a tool if you have a novel measurement for us.
MK: Thanks all for being on time and please give us feedback if there were
talks you liked, didn't like, or things you'd like to see in future.
## Abstracts

Zesplot in five minutes - An attempt to visualise IPv6 address space (Luuk
Hendriks) Visualising IPv6 address space is a challenging exercise. While
approaches based on Hilbert curves have proven to be useful in the IPv4 space,
they end up producing uselessly large visualisations when applied to the IPv6
space. Inspired by the IPv6 Hackathon organized by the RIPE NCC in November
2017, our experimental tool Zesplot is an attempt to apply the idea of
so-called squarified treemaps on IPv6 prefixes and addresses. Zesplot
produces plots based on two inputs: a list of prefixes, and a list of
addresses. The list of IPv6 prefixes is used to display squares, where the size
of the square reflects the size of the prefix. Then, the list of IPv6 addresses
is used to determine the colour of said squares: the more addresses are within
a certain prefix, the brighter that square is coloured. Thus, one can easily
spot outliers in an input set: a small but bright square for example, means
many 'hits' from a small prefix. Example use cases are visualisation of access
logs of e.g. webservers, origin of spam mail, or gaining insights into
measurement results for anything related to IPv6. Another possible use case is
education or address planning, where one can directly see the impact of
splitting up a prefix in different ways. Currently, Zesplot outputs to SVG with
an HTML/JS wrapper, allowing for zooming in/out on the plot, and providing
additional info (think ASN, number of addresses per prefix) while hovering over
the
squares. We are eager to learn what use cases are most useful for people, both
operators and researchers, to determine the direction for Zesplot. A first
version of the tool should be available under a permissive open source license
soon. https://www.win.tue.nl/~vanwijk/stm.pdf
Measuring the quality of DNSSEC deployment (Roland van Rijswijk-Deij)
In 2017 we performed two extensive studies of the DNSSEC ecosystem using
longitudinal data collected by the OpenINTEL active DNS measurement system
(https://openintel.nl/). Both studies focused on the quality of DNSSEC
deployments. In other words: if organisations bother to deploy DNSSEC, do they
deploy it in a secure way? We find that in generic TLDs, DNSSEC deployment is
low (1%). Fortunately, that 1% does mostly get it right; "real" errors in
DNSSEC deployment are rare. When we zoom in on two ccTLDs that have
incentivized DNSSEC deployment (.nl and .se), the picture is a bit more grim.
While errors are rare, deployments seldom follow best practices, leading to
potentially insecure DNSSEC deployment.
Update on client adoption for both TLS SNI and IPv6 (Erik Nygren)
With the exhaustion of IPv4, the multi-tenancy enabled by TLS SNI is critical
to supporting the rapid adoption of HTTPS. Over the past few years, TLS SNI
has gone from having insufficient adoption to be generally useful to being
viable in a majority of cases. IPv6 can also help here by not being
address-limited and has also seen solid growth in many countries. Akamai has
been closely tracking global adoption of both IPv6 and TLS SNI (and taking
steps to influence both) over the past few years. This talk will provide an
update on where the world is with end-user and client adoption for both TLS SNI
and IPv6, based on traffic statistics being collected from Akamai traffic
delivery. We will highlight both leaders and laggards, looking at areas that
can have the most leverage for increasing global adoption of both.
On the use of TCP's Initial Congestion Window in IPv4 and by Content Delivery
Networks (Jan Rüth)
Paper “Large-Scale Scanning of TCP’s Initial Window”:
https://conferences.sigcomm.org/imc/2017/papers/imc17-final43.pdf Improving web
performance is fueling the debate over sizing TCP's initial congestion window
(IW). This debate yielded several RFC updates to recommended IW sizes, e.g., an
increase to IW10 in 2010. The current adoption of IW recommendations is,
however, unknown. First, we conduct large-scale measurements covering the
entire IPv4 space, inferring the IW size distribution by probing HTTP and HTTPS.
We find that many relevant systems have followed the recommendation of IW10,
yet a large body of legacy systems is still holding on to past standards.
Second, to understand if standardization and research perspective still meet
Internet reality, we further study the IW configurations of major Content
Delivery Networks (CDNs) as known adopters of performance optimizations. Our
study makes use of a globally distributed infrastructure of VPNs giving access
to residential access links that enables us to shed light on network-dependent
configurations. We observe that most CDNs are well aware of the IW's impact and
find a high amount of customization that is beyond current Internet standards.
Further, we find CDNs that utilize different IWs for different customers and
content while others resort to fixed values. We find various initial window
configurations, most below 50 segments yet with exceptions of up to 100
segments — tenfold the current standard. Our study highlights that Internet
reality has drifted away from recommended practices and thus updates are
required.
A First Look at QUIC in the Wild (Jan Rüth)
Paper (author's version): https://arxiv.org/abs/1801.05168 For the first time
since the establishment of TCP and UDP, the Internet transport layer is subject
to a major change by the introduction of QUIC. Initiated by Google in 2012,
QUIC provides a reliable, connection-oriented low-latency and fully encrypted
transport. We provide the first broad assessment of QUIC usage in the wild. We
have been monitoring the entire IPv4 address space since August 2016, and about
46% of the DNS namespace, to detect QUIC-capable infrastructures. As of October
2017, our measurements show that the number of QUIC-capable IPs has more than
tripled since then, to over 617.59K. We find around 161K domains hosted on
QUIC-enabled infrastructure, but only 15K of them present valid certificates
over QUIC. We publish up-to-date data through:
https://quic.comsys.rwth-aachen.de. Second, we
analyze over one year of traffic traces provided by MAWI, one day of a major
European tier-1 ISP and from a large IXP to understand the dominance of QUIC in
the Internet traffic mix. We find QUIC to account for 2.6% to 9.1% of the
current Internet traffic, depending on the vantage point. This share is
dominated by Google pushing up to 42.1% of its traffic via QUIC.
Adoption, Human Perception, and Performance of HTTP/2 Server Push (Torsten
Zimmermann) The web is currently subject to a major protocol shift with the
transition to HTTP/2, which overcomes limitations of HTTP/1. For instance, it
now is a binary protocol that enables request-response multiplexing and
introduces Server Push as a new request model. While Push is regarded as key
feature to speed-up the web by saving unnecessary round-trips, the IETF
standard does not define its usage, i.e., what to push when. The goal of our
work is to inform standardization with an up-to-date picture on i) its current
usage, ii) its influence on user perception, and iii) optimization potential.
Our Push usage assessment is based on large-scale measurements covering the
IPv4 space and the complete set of .com/.net/.org domains. We regularly report our
results at https://push.comsys.rwth-aachen.de. We find both the HTTP/2 and the
Push adoption to steadily increase, yet Push usage is orders of magnitude
lower than HTTP/2, highlighting its complexity to use (e.g., 220K domains on
the Alexa 1M support HTTP/2 and only 932 Push). Second, our performance
evaluation of Push-enabled sites shows that Push can both speed up and slow
down the web. These detrimental effects cannot be simply attributed to simple
factors like type, size, or fraction of pushed objects, again highlighting the
complexity of using push correctly. We assessed whether these effects are
user-perceivable in a user study, i.e., to assess if current
engineering and standardization efforts are indeed sufficient to optimize the
Web. Server Push can yield human-perceivable improvements, but also lead to
impairments. Notably, these effects are highly website specific and indicate
that finding a generic strategy is challenging. Our ongoing work studies how to
better use push. We thus thoroughly analyze Push performance impacts in a
controlled and isolated testbed. Based on these results and the previous
contributions, we investigate a novel approach to realize Server Push,
incorporating website specific knowledge and client-side aspects, that can lead
to improvements for some websites. We believe that our work can help to
understand how standardized features are applied in the wild and what are the
Inferring BGP Blackholing Activity in the Internet (Georgios Smaragdakis)
The Border Gateway Protocol (BGP) has been used for decades as the de facto
protocol to exchange reachability information among networks in the Internet.
However, little is known about how this protocol is used to restrict
reachability to selected destinations, e.g., that are under attack. While such
a feature, BGP blackholing, has been available for some time, we lack a
systematic study of its Internet-wide adoption, practices, and network
efficacy, as well as the profile of blackholed destinations. In this
presentation we describe how we develop and evaluate a methodology to
automatically detect BGP blackholing activity in the wild. We apply our method
to both public and private BGP datasets. We find that hundreds of networks,
including large transit providers, as well as about 50 Internet exchange points
(IXPs) offer blackholing service to their customers, peers, and members.
Between 2014 and 2017, the number of blackholed prefixes increased by a factor
of 6, peaking at 5K concurrently blackholed prefixes by up to 400 Autonomous
Systems. We assess the effect of blackholing on the data plane using both
targeted active measurements as well as passive datasets, finding that
blackholing is indeed highly effective in dropping traffic before it reaches
its destination, though it also discards legitimate traffic. We augment our
findings with an analysis of the target IP addresses of blackholing. We also
show that BGP blackholing correlates with periods of high activity of DDoS
attacks. Our tools and insights are relevant for operators considering offering
or using BGP blackholing services as well as for researchers studying DDoS
mitigation in the Internet.
Deploying MDA Traceroute on RIPE Atlas Probes (Kevin Vermeulen)
Traceroute is widely used by network operators, to troubleshoot, and by
scientists, to understand the topology of the Internet. In the presence of load
balancing, classic traceroute can lead to errors and misinterpretations, and
these have been corrected by the widely used Paris Traceroute, which we have
developed and maintain. Paris Traceroute's Multipath Detection Algorithm (MDA)
allows a user to discover the load balanced paths between a source and a
destination, with configurable statistical guarantees on the completeness of
the results. The more complex these topologies are, the more packets the MDA
requires in order to provide the guarantees. For a single route trace, the
numbers can run into the thousands of packets, and even tens of thousands. In a
resource constrained environment, such as a RIPE Atlas probe, this is too
costly. We have made an empirical study of the patterns in which load balancers
tend to reveal themselves in route traces, and we are using the results to
implement and deploy a new MDA, which promises to significantly reduce the
number of probe packets required to discover complete multipath routes. We
describe ongoing work that was presented last week at CAIDA's AIMS workshop. In
order to reduce the number of probe packets required to discover complete
multipath routes, we have done a survey of load balancers, consisting of
probing 350,000 IP destinations from 35 PlanetLab nodes as sources and
extracting some metrics. This talk will mainly focus on the results we have
found in the survey, and the metrics we have extracted to classify the load
balancers we have found.
An endhost-centric approach to detect network performance problems (Olivier
Tilmans and Olivier Bonaventure) As enterprises increasingly rely on cloud
services, their networks become a vital part of their daily operations. Many
enterprise networks use passive measurement techniques and tools, such as
NetFlow. However, these do not allow estimation of Key Performance Indicators
(KPIs) of connections, for example losses or delays. Although monitoring
functions on routers or middleboxes can be convenient from a deployment
viewpoint, they miss a lot of information about performance problems as they
need to infer the state of each connection and they will become less and less
useful as encrypted protocols are getting deployed (e.g., QUIC encrypts
transport headers). It is time to revisit the classical approaches to network
monitoring and exploit the information available on the end hosts. In this
talk, we propose a new monitoring framework where monitoring daemons directly
instrument end-hosts and export KPIs about the different transport protocols
towards an IPFIX collector. More specifically, our monitoring daemons insert at
runtime lightweight probes in the native transport stacks (e.g., the Linux
kernel TCP stack, libc’s name resolution routines, QUIC implementations) to
extract general statistics from the state maintained for each connection. An
aggregation daemon analyzes these statistics to detect events (e.g., connection
established, RTOs, reordering) and exports KPIs towards an IPFIX collector. We
will present a prototype deployment of these monitoring daemons in a campus
network, and discuss early measurement results.
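[Scribe's note: the aggregation step described in this abstract (per-connection events in, KPIs out towards a collector) can be sketched as follows. Names and event labels are illustrative, not the authors' implementation.]

```python
from collections import Counter

def aggregate_kpis(events):
    """Fold a stream of per-connection transport events into KPI counters.

    events: iterable of (conn_id, event_name) tuples, as might be emitted
    by in-stack probes (e.g. "rto", "reorder", "established"). Returns
    {conn_id: Counter} so an exporter could ship per-connection counts
    to an IPFIX collector.
    """
    kpis = {}
    for conn_id, event in events:
        kpis.setdefault(conn_id, Counter())[event] += 1
    return kpis
```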