Measurement and Analysis for Protocols Research Group (maprg) Agenda at IETF-106 (Singapore)

Date: Monday Nov 18, 18:10-19:10 (Afternoon session III)
Room: Padang

Overview & Status

Mirja Kühlewind
2 min

QUIC Deployment Update

Mirja
3 min

Losses in SATCOM systems: identification and impact

Nicolas Kuhn
10 min

TLS Beyond the Browser: Combining End Host and Network Data to Understand Application Behavior

Blake Anderson
15 min

A Look at the ECS Behavior of DNS Resolvers

Kyle Schomp (remote)
15 min

Characterizing JSON Traffic Patterns on a CDN

Santiago Vargas (remote)
15 min


Abstracts

Losses in SATCOM systems: identification and impact (Nicolas Kuhn)

This talk focuses on the identification and impact of losses in end-to-end SATCOM systems. We present three ways of assessing the presence of losses in SATCOM access: on the Wi-Fi link, before the bottleneck, and end-to-end. Although the satellite access itself is quite reliable, most of the losses are seen before the bottleneck (before the satellite link). The talk also presents the impact of losses on end-to-end protocols in such systems and discusses available solutions.

TLS Beyond the Browser: Combining End Host and Network Data to Understand Application Behavior (Blake Anderson and David McGrew (Cisco Systems))

Paper (PDF): http://delivery.acm.org/10.1145/3360000/3355601/p379-Anderson.pdf
The Transport Layer Security (TLS) protocol has evolved in response to different attacks and is increasingly relied on to secure Internet communications. Web browsers have led the adoption of newer and more secure cryptographic algorithms and protocol versions, and thus improved the security of the TLS ecosystem. Other application categories, however, are increasingly using TLS, but too often are relying on obsolete and insecure protocol options, as we found through a study of applications that use TLS at global enterprises. To understand in detail what applications are using TLS, and how they are using it, we developed a novel system for obtaining process information from end hosts and fusing it with network data to produce a TLS fingerprint knowledge base. This data has a rich set of context for each fingerprint, is representative of enterprise TLS deployments, and is automatically updated from ongoing data collection. Our dataset is based on 96 million endpoint-labeled and 2.4 billion unlabeled TLS sessions obtained from enterprise edge networks in five countries, plus millions of sessions from a malware analysis sandbox. We actively maintain an open source dataset that, at 2,200+ fingerprints and counting, is both the largest and most informative ever published. In this paper, we use the knowledge base to identify trends in enterprise TLS applications beyond the browser: application categories such as storage, communication, system, and email. We study fingerprint prevalence, longevity, and succession across application versions, and identified a rise in the use of TLS by non-browser applications and a corresponding decline in the fraction of sessions using version 1.3. Finally, we highlight the shortcomings of naïvely applying TLS fingerprinting to detect malware, and we present recent trends in malware's use of TLS such as the adoption of cipher suite randomization.
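
As a rough illustration of the fusion step described in the abstract, the sketch below joins end-host process records with network-observed TLS sessions to build a fingerprint-to-process table. It is not the authors' system: the record layout (`flow`, `process`, `fingerprint` keys), the use of the flow 5-tuple as the join key, and the function name `build_knowledge_base` are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def build_knowledge_base(network_sessions, host_records):
    """Map each TLS fingerprint to a count of the processes seen using it.

    Hypothetical input layout (an assumption of this sketch):
      network_sessions: dicts with "flow" (5-tuple) and "fingerprint" (str)
      host_records:     dicts with "flow" (5-tuple) and "process" (str)
    """
    # Index end-host observations by flow so they can be joined to sessions.
    process_by_flow = {rec["flow"]: rec["process"] for rec in host_records}

    knowledge_base = defaultdict(Counter)
    for session in network_sessions:
        process = process_by_flow.get(session["flow"])
        if process is not None:  # unlabeled sessions are skipped here
            knowledge_base[session["fingerprint"]][process] += 1
    return knowledge_base


flow_a = ("10.0.0.5", 50001, "203.0.113.9", 443, "tcp")
flow_b = ("10.0.0.6", 50002, "203.0.113.9", 443, "tcp")
flow_c = ("10.0.0.7", 50003, "198.51.100.4", 443, "tcp")

kb = build_knowledge_base(
    network_sessions=[
        {"flow": flow_a, "fingerprint": "fp-1"},
        {"flow": flow_b, "fingerprint": "fp-1"},
        {"flow": flow_c, "fingerprint": "fp-2"},  # no matching host record
    ],
    host_records=[
        {"flow": flow_a, "process": "browser.exe"},
        {"flow": flow_b, "process": "sync-client.exe"},
    ],
)
```

In this toy data the single fingerprint `fp-1` maps to two different processes, which hints at why a fingerprint on its own may be an ambiguous signal for identifying an application.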

A Look at the ECS Behavior of DNS Resolvers (Rami Al-Dalky and Michael Rabinovich (Case Western Reserve University), Kyle Schomp (Akamai Technologies))

Paper (PDF): http://delivery.acm.org/10.1145/3360000/3355586/p116-Al-Dalky.pdf
Content delivery networks (CDNs) commonly use DNS to map end-users to the best edge servers. A recently proposed EDNS0-Client-Subnet (ECS) extension allows recursive resolvers to include end-user subnet information in DNS queries, so that authoritative nameservers, especially those belonging to CDNs, could use this information to improve user mapping. In this paper, we study the ECS behavior of ECS-enabled recursive resolvers from the perspectives of the opposite sides of a DNS interaction, the authoritative nameservers of a major CDN and a busy DNS resolution service. We find a range of erroneous (i.e., deviating from the protocol specification) and detrimental (even if compliant) behaviors that may unnecessarily erode client privacy, reduce the effectiveness of DNS caching, diminish ECS benefits, and in some cases turn ECS from facilitator into an obstacle to authoritative nameservers' ability to optimize user-to-edge-server mappings.
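
For readers unfamiliar with ECS, the following sketch shows how the option could be encoded on the wire, following the layout in RFC 7871 (option code 8, then family, source prefix length, scope prefix length, and the address truncated to the prefix). The function name `build_ecs_option` is an assumption of this example and is not taken from the paper. Truncating the address to the source prefix is the step that limits how much client information a resolver reveals.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # EDNS0 option code for Client Subnet (RFC 7871)

def build_ecs_option(client_ip: str, source_prefix_len: int) -> bytes:
    """Encode an ECS option as it would appear in a query's OPT record."""
    addr = ipaddress.ip_address(client_ip)
    family = 1 if addr.version == 4 else 2  # 1 = IPv4, 2 = IPv6
    if not 0 <= source_prefix_len <= addr.max_prefixlen:
        raise ValueError("source prefix length out of range")

    # Zero the bits beyond the prefix, then keep only the bytes needed.
    network = ipaddress.ip_network(f"{client_ip}/{source_prefix_len}", strict=False)
    n_bytes = (source_prefix_len + 7) // 8
    addr_bytes = network.network_address.packed[:n_bytes]

    # The scope prefix length is set to 0 in queries.
    payload = struct.pack("!HBB", family, source_prefix_len, 0) + addr_bytes
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload


option = build_ecs_option("192.0.2.77", 24)
print(option.hex())  # 0008000700011800c00002
```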

Characterizing JSON Traffic Patterns on a CDN (Santiago Vargas and Aruna Balasubramanian (Stony Brook University), Moritz Steiner and Utkarsh Goel (Akamai))

Paper (PDF): http://delivery.acm.org/10.1145/3360000/3355594/p195-Vargas.pdf
Content delivery networks serve a major fraction of the Internet traffic, and their geographically deployed infrastructure makes them a good vantage point to observe traffic access patterns. We perform a large-scale investigation to characterize Web traffic patterns observed from a major CDN infrastructure. Specifically, we discover that responses with 'application/json' content-type form a growing majority of all HTTP requests. As a result, we seek to understand what types of devices and applications are requesting JSON objects and explore opportunities to optimize CDN delivery of JSON traffic. Our study shows that mobile applications account for at least 52% of JSON traffic on the CDN and embedded devices account for another 12% of all JSON traffic. We also find that more than 55% of JSON traffic on the CDN is uncacheable, showing that a large portion of JSON traffic on the CDN is dynamic. By further looking at patterns of periodicity in requests, we find that 6.3% of JSON traffic is periodically requested and reflects the use of (partially) autonomous software systems, IoT devices, and other kinds of machine-to-machine communication. Finally, we explore dependencies in JSON traffic through the lens of ngram models and find that these models can capture patterns between subsequent requests. We can potentially leverage this to prefetch requests, improving the cache hit ratio.
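
To make the n-gram idea concrete, here is a minimal sketch of a count-based n-gram model over request sequences that suggests a likely next request as a prefetch candidate. It is an illustration under assumptions, not the authors' model: the function names `train_ngram` and `predict_next`, the per-client session lists, and the example URLs are all hypothetical.

```python
from collections import Counter, defaultdict

def train_ngram(sequences, n=2):
    """Count which request follows each context of n-1 preceding requests."""
    model = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            context = tuple(seq[i:i + n - 1])
            next_request = seq[i + n - 1]
            model[context][next_request] += 1
    return model

def predict_next(model, recent, n=2, k=1):
    """Return up to k most frequent followers of the last n-1 requests."""
    context = tuple(recent[-(n - 1):]) if n > 1 else ()
    followers = model.get(context, Counter())
    return [request for request, _ in followers.most_common(k)]


# Hypothetical per-client request sequences.
sessions = [
    ["/api/login", "/api/profile", "/api/feed"],
    ["/api/login", "/api/profile", "/api/settings"],
    ["/api/login", "/api/profile", "/api/feed"],
]
model = train_ngram(sessions, n=3)
candidates = predict_next(model, ["/api/login", "/api/profile"], n=3)
print(candidates)  # ['/api/feed']
```

A CDN could, in principle, prefetch the returned candidates when the observed counts are high enough. Choosing that threshold, and deciding whether the responses are cacheable at all, is outside the scope of this sketch.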