   Measurement and Analysis for Protocols Research Group (maprg) - IETF-99 (Prague)

Intro & Overview, Project "Advertisements" - Mirja Kuhlewind & Dave Plonka
---
The agenda is a combination of:
- Reporting on nascent measurement results
- A call to arms for new projects spinning up

Chairs keen to hear feedback on how the agenda has been structured.

Lars Eggert: We held the ANRW workshop on Saturday for the second time; next
summer it will be in Montreal. There will be a CFP etc. This year we didn't
realise the conflict with the IMC deadline - we will try to avoid that next
year. Please keep it in mind if you have something relevant to publish. If you
are a North American academic and would like to be considered for TPC chair,
let us know.

Dave Plonka: Where to send comments and feedback on ANRW?

Lars: irtf-discuss list.

IPv6 Reluctant Devices and Applications - Mikael Abrahamsson
---
Abstract: There are numerous reports of consumer products and devices that
either (a) don't support IPv6 yet or (b) ostensibly support IPv6 in that they
configure an IPv6 address, but subsequently seem not to use it, perhaps due to
their Happy Eyeballs implementation. We request that a measurement effort be
undertaken to identify and report on this behavior, so that we can mitigate
this impediment to IPv6[-only] operation.

Dave: This is something we're interested in - can we proactively do something
with measurement requests from IETF?

Fred Baker: A root operator looking at IPv6 deployment suggested that we go for
an interop test where we can prove that devices do IPv6-only, so that people
would get the message in a positive manner. Is this useful?

Mikael: Would this be for apps? Devices?

Fred: Any kind of device. Cisco is working with customers on IPv6-only
deployment. Several networks too. So this would extend the reach to other
places.

Mikael: Apple did NAT64 testing - IPv6-only WiFi - that's one way of testing.
At least then you know that the device and application can speak IPv6 -
spreading that approach into more places would be good.

Tim Winters (UNH-IOL): I'm involved in the program that monitors device IPv6
stacks. No smart TVs though - coverage is not complete. We can configure a
device to turn on IPv6, but it's not necessarily on by default - we could make
that a requirement in future. The testing we do shows more application testing
in IPv6-only environments over the last year - we're starting to see movement
in that direction from vendors.

Mikael: Would love to see IPv6-default-on be a requirement.

Tim Winters (UNH-IOL): People don't always want it on all the time - I'll take
an action to press on this.

Mikael: When people say 'the market doesn't require IPv6', I say the market
requires Internet access - today that means dual stack with IPv6 on by default.

Fred: I have a presentation later today on the challenges of enabling IPv6.

Dave: Interested in ideas from the group about how to conduct this measurement
study. More discussion on the list, and if we have something to share by the
next meeting that would be good.

Fingerprint-based detection of DNS hijacks using RIPE Atlas - Pawel Foremski
---
Abstract: The DNS protocol does not protect from hijacking resolver traffic. In
fact, some network operators and government agencies transparently redirect DNS
queries made from end-user devices to popular recursive resolvers to their own
servers. This effectively allows the hijacking party to easily monitor, block,
and manipulate the content published to the World Wide Web, and in general to
control its Internet users. In this paper, we present a novel fingerprinting
algorithm that is able to detect hijacking of recursive DNS resolver traffic.
By sending specific DNS queries to the resolver and analyzing the replies using
machine learning algorithms, we are able to classify the resolver as legitimate
or rogue. We evaluate our technique through the RIPE Atlas platform, collecting
DNS measurements on Google Public DNS and OpenDNS servers from 9,500 probes
distributed around the world.
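
As a rough illustration of the probing side of this technique (a sketch, not
the authors' code - the probe names, the feature choices, and the use of
dnspython are all assumptions), a resolver fingerprint could be collected like
this:

    import time
    import dns.resolver  # dnspython

    # Hypothetical probe set; the paper's actual query set is not given here.
    PROBES = [("example.com", "A"), ("example.org", "AAAA")]

    def fingerprint(nameserver):
        """Collect simple per-query features (RTT, answer TTL) from one
        resolver; a classifier trained on replies from the genuine
        resolver would then label the path legitimate or hijacked."""
        r = dns.resolver.Resolver(configure=False)
        r.nameservers = [nameserver]
        feats = []
        for name, rdtype in PROBES:
            t0 = time.monotonic()
            ans = r.resolve(name, rdtype)
            feats.append(((time.monotonic() - t0) * 1000.0, ans.rrset.ttl))
        return feats  # [(rtt_ms, ttl), ...] per probe

    print(fingerprint("8.8.8.8"))  # probe Google Public DNS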

Aaron Falk: Encourage you to publish this so I can share a link to your work.
Surprised to see that RTT was on there given use of anycast for DNS. Was
anycast a factor in your measurements? Did you look at the contents of the
response - just interception or data manipulation?

Pawel: Regarding anycast, as far as I know both Google and OpenDNS are anycast
operators, so we couldn't compare. Regarding replies, we haven't had time yet
to analyse them - that is future work.

Colin Perkins: Did you do anything to infer the reason for the hijacking?

Pawel: Yes, interesting - that is future work. Hard to think of source of
information to answer this question - motivation often hard to infer.

Jana Iyengar: This is great work, please publish. Have you considered calling
some of these ISPs and just talking to them?

Pawel: Yes, we are considering that - but it would be future work.

Jana: We've done some of that and found that in some cases they are not clueful
about the implications of what they are doing and are willing to learn.

Giovane Moura: Regarding the root DDoS attacks in 2015 - you can download the
data, and it was noticeable that hijacked probes had very short RTTs - that
just uses public measurements with standardised queries. The difference between
your method and that would be interesting. Does your method show more hijacks?

Pawel: That's interesting. We would need to compare the measurements. The data
is open. It would be easy to run more measurements.

??: How did you split your data set into training and testing? Did you perform
any cross validation?

Pawel: 30 times randomly, so split equally. No cross-validation yet, but it's
on our roadmap.
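
For reference (an illustrative sketch, not the authors' pipeline - the
classifier choice and placeholder data are assumptions), repeated random splits
and the cross-validation asked about look like this with scikit-learn:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score, train_test_split

    # Placeholder feature matrix and legitimate/rogue labels.
    X = np.random.rand(1000, 8)
    y = np.random.randint(0, 2, size=1000)

    # One random 50/50 split; the talk mentions repeating this 30 times.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5)
    clf = RandomForestClassifier().fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))

    # k-fold cross-validation, as raised in the question:
    scores = cross_val_score(RandomForestClassifier(), X, y, cv=5)
    print("5-fold CV accuracy:", scores.mean())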

Aaron Falk channeling jabber (Simon Ferlin): Would it be interesting to see the
user experience for hijacked probes?

Pawel: Definitely, but how would you collect this data? Maybe crowd-sourcing?

Rate-limiting of IPv6 traceroutes is widespread: measurements and mitigations -
Pablo Alvarez
---
Abstract: With IPv6, high-frequency traceroutes show many more missing hops
than IPv4. This appears to be due to the fact that RFC 4443
states IPv6 routers MUST rate-limit the ICMPv6 error packets that traceroute
and similar programs depend on to determine hops on a route. We measured the
characteristics of this rate-limiting from many vantage points to thousands of
targets across continents. We find that about 2/3 of all routers exhibit
rate-limiting at frequencies < 100 Hz. The distribution of the data suggests
most of these rate-limits might be factory defaults. We discuss strategies to
overcome this limitation, including altering the order of packets across single
or multiple traceroutes, and merging information from traceroutes to different
targets.

Mirja: Are there any defaults in the RFC?

Pablo: No.

Jen Linkova: The problem here is that there is a default value that you could
change, but the hardware cannot be changed. We don't know how many ICMP
messages a router is sending to other hosts. It could also be affected by
neighbour discovery - I'm not surprised to see IPv6 is worse here. Changing
defaults may not be possible due to hardware.

Mikael Abrahamsson: Did you try changing source addresses?

Pablo: We did a quick and dirty test - it looks like limits are per router, not
per source.

Mikael: I don't think you can make that leap - it could be per line card, and
there might be several of these. Did you differentiate between getting 0, 1, 2
or 3 responses from a router?

Pablo: We only tested each hop once.

Mikael: Would be interesting to see what happened if you did send more.

Andrew McGregor: Having worked on a few IPv6 router implementations, it's
likely there are higher limits out there. But you're likely to find that,
despite the CPU being fast, there isn't enough PPS available between the
control and forwarding planes to get limits higher - you find limits in exotic
places inside the router chassis. Yes, it's usually the linecard CPU doing
this.

Pablo: Are there additional rate limits beyond the token bucket?

Andrew: Possibly, or the limit is not a token bucket, but some PPS limit inside
the router implementation - e.g. for all control-plane packets.

Pablo: The data I got are consistent in most cases with a token bucket.
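
To illustrate what "consistent with a token bucket" means for traceroute
responses (a minimal model; the rate and burst parameters here are assumed for
illustration):

    def token_bucket_responses(probe_times, rate, burst):
        """Which ICMPv6 error probes would a router answer if it
        rate-limits with a token bucket that refills at `rate`
        tokens/s up to `burst` tokens?"""
        tokens, last = float(burst), probe_times[0]
        answered = []
        for t in probe_times:
            tokens = min(burst, tokens + (t - last) * rate)
            last = t
            if tokens >= 1.0:
                tokens -= 1.0
                answered.append(True)   # Time Exceeded sent
            else:
                answered.append(False)  # rate-limited: hop appears missing
        return answered

    # Probing at 200 Hz against a 100-token/s bucket: once the initial
    # burst drains, roughly every other probe goes unanswered.
    print(token_bucket_responses([i / 200.0 for i in range(40)], 100.0, 10))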

John Brzozowski: It would be interesting to run this internally within a large
provider network. Addressing methodology could prevent getting a proper
response - maybe links aren't using globally routable addresses.

Pablo: IPv4 traces often include RFC1918 addresses. We didn't filter out
non-globally-routable addresses. We can look at the data more for this.

Mirja: Did you just say you want to conduct some measurements and make the
results public?

John: I could do that.

Andrew ??: The limit may be a hardware limit, e.g. if slides are used. You will
hit limits; ICMP limits are usually software-imposed. Talking to vendors could
help find a way to conduct this testing without difficulties like this. It's
really only old hardware that you would have a concern with.

Olivier Bonaventure: Is the data that you collect publicly available?

Pablo: Not sure - talk to me offline.

Olivier: Public data projects (CAIDA etc.) help to incentivise future
researchers to make data public, and to use existing traceroute tools instead
of writing their own tools to collect private data.

kIP: a Measured Approach to IPv6 Address Anonymization - David Plonka
---
Abstract: Related pre-print, "kIP: a Measured Approach to IPv6 Address
Anonymization" (Plonka & Berger). Privacy-minded Internet service operators
anonymize IPv6 addresses by truncating them to a fixed length, perhaps due to
long-standing use of this technique with IPv4 and a belief that it's "good
enough." We claim that simple anonymization by truncation is suspect since it
does not entail privacy guarantees nor does it take into account some common
address assignment practices observed today. To investigate, with standard
activity logs as input, we develop a counting method to determine a lower bound
on the number of active IPv6 addresses that are simultaneously assigned, such
as those of clients that access World-Wide Web services. In many instances, we
find that these empirical measurements offer no evidence that truncating IPv6
addresses to a fixed number of bits, e.g., 48 in common practice, protects
individuals' privacy. To remedy this problem, we propose kIP anonymization, an
aggregation method that ensures a certain level of address privacy. Our method
adaptively determines variable truncation lengths using parameter k, the
desired number of active (rather than merely potential) addresses, e.g., 32 or
256, that cannot be distinguished from each other once anonymized. We describe
our implementation and present first results of its application to millions of
real IPv6 client addresses active over a week's time, demonstrating both
feasibility at large scale and ability to automatically adapt to each network's
address assignment practice and synthesize a set of anonymous aggregates
(prefixes), each of which is guaranteed to cover (contain) at least k of the
active addresses. Each address is anonymized by truncating it to the length of
its longest matching prefix in that set.
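
The aggregation idea can be sketched in a few lines (a toy version for
illustration only, not the paper's implementation): recursively split a prefix
only while every kept piece still covers at least k observed active addresses,
then truncate each address to its longest matching prefix.

    import ipaddress

    def kip_prefixes(addrs, k, prefix=ipaddress.ip_network("::/0")):
        """Split `prefix` recursively while each kept piece still
        covers at least k of the observed active addresses."""
        inside = [a for a in addrs if a in prefix]
        if len(inside) < k:
            return []                    # too few actives; merge upward
        if prefix.prefixlen == 128:
            return [prefix]
        lo, hi = prefix.subnets(prefixlen_diff=1)
        below = kip_prefixes(inside, k, lo) + kip_prefixes(inside, k, hi)
        if all(any(a in p for p in below) for a in inside):
            return below                 # every active is still covered
        return [prefix]                  # otherwise keep this prefix whole

    def anonymize(addr, prefixes):
        """Truncate an address to its longest matching prefix
        (assumes `prefixes` covers it)."""
        return max((p for p in prefixes if addr in p),
                   key=lambda p: p.prefixlen)

    active = [ipaddress.ip_address("2001:db8::%x" % i) for i in range(64)]
    aggs = kip_prefixes(active, k=32)    # yields two /123s, 32 actives each
    print([str(p) for p in aggs], anonymize(active[5], aggs))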

Tim Chown: I don't think you can tell identifiers apart based on randomisation.
Sometimes you get a new IID, sometimes you don't, depending on the address
class.

Dave: The 3rd column is the stability of the address. We can use a sliding time
window and test whether we've ever seen that value before. I'm saying this
works with temporary/SLAAC/privacy address mechanisms - we can reject
persistent pseudorandoms.
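
A sketch of that sliding-window test (illustrative only - the window size and
data structure are assumptions, not Dave's implementation):

    import ipaddress
    from collections import OrderedDict

    class IIDWindow:
        """Remember interface identifiers (low 64 bits of an IPv6
        address) seen within the last `window` seconds."""
        def __init__(self, window=7 * 86400):
            self.window, self.seen = window, OrderedDict()

        def repeated(self, addr_int, now):
            iid = addr_int & ((1 << 64) - 1)
            cutoff = now - self.window
            # Expire identifiers that fell out of the window.
            while self.seen and next(iter(self.seen.values())) < cutoff:
                self.seen.popitem(last=False)
            hit = iid in self.seen
            self.seen[iid] = now
            self.seen.move_to_end(iid)
            return hit  # True: persistent IID, not a fresh temporary one

    w = IIDWindow()
    a = int(ipaddress.ip_address("2001:db8::1"))
    print(w.repeated(a, now=0.0), w.repeated(a, now=3600.0))  # False True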

Alex ???: We can apply this to other areas, e.g. rate-limiting for services. It
would be great if in some way we could take this work into the IETF to make a
protocol where I could identify suitable prefix lengths for anonymization.
European data legislation is also relevant now.

Dave: Agreed. Are results portable, though - what if the place you're porting
them from has better visibility than where you are?

Alex: Could we publish results and aggregate them, to know where to cut the
prefix?

Dave: Integrate observations in some private way. There are methods that allow
you to share addresses in an opaque way.

Mirja: This sounds like a guidance document that we should take on as a group.

Dave: Don't know where in the IETF community to talk about such a thing. Could
be candidate for a draft in this group initially.

Pablo Alvarez: In terms of re-use, do you have some idea of the temporal
stability? Do ISP properties change over time? How often would you need to
renew and share?

Dave: Don't know, but you're right to be concerned. It works well for offline
analysis. We need to do more work here.

Mikael Abrahamsson: Anti-abuse handling in the IETF would be interested if you
can identify prefix lengths allocated by ISPs. Where is the household level of
aggregation? RIRs have a way of letting ISPs publish this, but nobody does.
Measurement would help.

Dave: We can discover, or they can tell us - would love to compare the two.

Measuring Latency Variation in the Internet - Toke Hoiland-Jorgensen
---
Abstract: Related paper, "Measuring Latency Variation in the Internet" (Toke
Hoiland-Jorgensen et al., CoNEXT '16). We analyse two complementary datasets to
quantify the latency variation experienced by internet end-users: (i) a
large-scale active measurement dataset (from the Measurement Lab Network
Diagnostic Tool) which sheds light on long-term trends and regional
differences;
and (ii) passive measurement data from an access aggregation link which is used
to analyse the edge links closest to the user. The analysis shows that
variation in latency is both common and of significant magnitude, with two
thirds of samples exceeding 100 ms of variation. The variation is seen within
single connections as well as between connections to the same client. The
distribution of experienced latency variation is heavy-tailed, with the most
affected clients seeing an order of magnitude larger variation than the least
affected. In addition, there are large differences between regions, both within
and between continents. Despite consistent improvements in throughput, most
regions show no reduction in latency variation over time, and in one region it
even increases. We examine load-induced queueing latency as a possible cause
for the variation in latency and find that both datasets readily exhibit
symptoms of queueing latency correlated with network load. Additionally, when
this queueing latency does occur, it is of significant magnitude, more than 200
ms in the median. This indicates that load-induced queueing contributes
significantly to the overall latency variation.
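
As a rough sketch of the kind of per-flow metric involved (the paper's exact
definition may differ; the percentile choice here is an assumption):

    import statistics

    def latency_variation(rtts_ms):
        """Span between a low and a high RTT quantile for one flow."""
        q = statistics.quantiles(rtts_ms, n=100)
        return q[97] - q[1]  # ~98th percentile minus ~2nd percentile

    flows = [[20, 22, 25, 180, 30, 21, 150, 24, 26, 200]]  # made-up samples
    hit = sum(latency_variation(f) > 100 for f in flows)
    print(f"{hit}/{len(flows)} flows exceed 100 ms of latency variation")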

Mikael Abrahamsson: Is the same TCP congestion avoidance algorithm used
throughout the measurement period?

Toke: Good question. I don't know.

Andrew McGregor: MLabs server side TCP hasn't changed over this period. Client
side may have.

Mikael: TCP Window scaling might also have changed over this time span.

Mirja: Did you try to detect the presence of AQM?

Toke: No, not directly.

Mirja: Data is from 2015? Most interesting data would be 2015-2017.

Toke: Think it would be straightforward to re-run this on a new dataset.

Mirja: Please share dataset and code with list.

Vaibhav Bajpai: A caveat - the measurements are towards MLabs - this may not be
'normal' traffic.

Toke: Not speedtest.net.

Vaibhav: That's not normal traffic either - traffic to facebook, google, etc.
is 'normal'.

Qiabong Xie (Netflix): How big was the total data set?

Toke: 200M - we erred on the side of minimising false positives, hence ended up
with 5M flows.

Qiabong Xie (Netflix): If you see 80% of flows that see congestion, it might
not be sender-induced.

Toke: Some flows do not get a single congestion event at all over 10 seconds,
but latency increases quite a lot.

Qiabong Xie (Netflix): That percentage means that the rest of the flows don't
have this pattern - likely cross-traffic-induced, not sender-induced.

Toke: Could be - I wouldn't say that none of the other flows had sender-induced
congestion events, but certainly for a lot of these flows that is true.

Garret Tyson: How did MLab evolve during your measurement period? For example,
Africa has more measurement servers now than previously.

Toke: That's true.

Measuring YouTube Content Delivery over IPv6 - Vaibhav Bajpai
---
Abstract: To appear in the SIGCOMM Computer Communication Review (CCR), July
issue. We measure the performance of YouTube over IPv6 using ~100 SamKnows
probes connected to dual-stacked networks representing 66 different origin
ASes. Using a 29-months long (Aug 2014 - Jan 2017) dataset, we show that
success rates of streaming a stall-free version of a video over IPv6 have
improved over time. We show that a Happy Eyeballs (HE) race during initial TCP
connection establishment leads to a strong (more than 97%) preference over
IPv6. However, even though clients prefer streaming videos over IPv6, we
observe worse performance over IPv6 than over IPv4. We witness consistently
higher TCP connection establishment times and startup delays (~100 ms or more)
over IPv6. We also observe consistently lower achieved throughput both for
audio and video over IPv6. We observe less than 1% stall rates over both
address families. Due to lower stall rates, bitrates that can be reliably
streamed over both address families are comparable. However, in situations
where a stall does occur, 80% of the samples experience stall durations that
are at least 1s longer over IPv6, and these have not reduced over time. The
worse
performance over IPv6 is due to the disparity in the availability of Google
Global Caches (GGC) over IPv6. The measurements performed in this work using
the youtube test and the entire dataset is made available to the measurement
community.
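
For reference, the Happy Eyeballs race described above works roughly like this
(a toy sketch; the 300 ms head start is an illustrative timer value, and a
production implementation per RFC 6555 also cancels the losing attempt and
handles errors more carefully):

    import concurrent.futures
    import socket

    def happy_eyeballs(host, port, head_start=0.3):
        """Try IPv6 first; fall back to IPv4 if IPv6 hasn't connected
        within `head_start` seconds."""
        def connect(family):
            info = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
            return socket.create_connection(info[0][4][:2])

        pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
        v6 = pool.submit(connect, socket.AF_INET6)
        try:
            return v6.result(timeout=head_start)  # IPv6 wins the race
        except Exception:
            return pool.submit(connect, socket.AF_INET).result()

    print(happy_eyeballs("www.youtube.com", 443).family)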

Mo Boucadair: There is no recommendation about 300ms in the RFC. It's up to the
implementors to choose timers. Formally there's no update to the RFC required.

Vaibhav: Correct.

Dave Wilson: Interesting that stall rates and bit rates are the same - would
the user really see a difference? I'm not sure that latency is that important
for video.

Vaibhav: Startup delay is also worth looking at.

Dave Wilson: A lot of people think measuring latency is a guide to customer
experience - but your results suggest otherwise.

Mikael Abrahamsson: It would be interesting to get the diagrams for the last 12
months of data to see what the current state is vs. what was happening in 2013.
We don't want historical problems to impact conclusions about what to do now.

Jana Iyengar: Regarding the dip on slide 8 - is the client running TCP?

Vaibhav: Yes that is TCP.

Jana: One of the graphs in my presentation coming up next is going to coincide
exactly with that dip, and I don't know why.

Ian Swett: Did you compare IPv4-only vs. Happy Eyeballs?

Vaibhav: We run a test over IPv6, then over IPv4 - almost at the same time.

Ian: GGC nodes that don't speak IPv6 are just excluded from the sample?

Vaibhav: Yes, we have to.

Ian: Results are surprising to me, but I will make some enquiries.

Mikael Abrahamsson: If the closest GGC node is IPv4-only, then IPv6 tests will
be over a different path.

Vaibhav: Would be interested in expansion of dual-stacked GGC nodes.

Giovane Moura: Are earlier measurements comparable with this?

Vaibhav: 2015 work was latency towards websites in general, this work is
specific to YouTube.

Jen Linkova: I would be interested to see the difference between when the
nearest GGC node is dual-stack vs. when the nearest node is IPv4-only.

Vaibhav: Yes we are working on this now.

Jen: The onus is on operators to provide IPv6 connectivity - it's not up to
Google to provision IPv6 on GGC nodes - they'll be dual-stack when they're on
dual-stack networks.

Vaibhav: Understood, my mistake.

The QUIC Transport Protocol: Design and Internet-Scale Deployment - Jana Iyengar
---
Abstract: QUIC is an encrypted, multiplexed, and low-latency transport protocol
designed from the ground up to improve transport performance for HTTPS traffic
and to enable rapid deployment and continued evolution of transport mechanisms.
QUIC has been globally deployed at Google on thousands of servers and is used
to serve traffic to a range of clients including a widely-used web browser
(Chrome) and a popular mobile video streaming app (YouTube). We estimate that
7% of Internet traffic is now QUIC. This talk will cover the Internet-scale
process that we used to perform iterative experiments on QUIC, performance
improvements seen by our various services, and our experience deploying QUIC
globally.

Mirja: Which congestion control does this data use?

Jana: Cubic with two-connection emulation, because the TCP side uses two
connections.

Andrew Doganow: QUIC is now enabled by default in Chrome. I live in Singapore,
where you're not helping; you're making it slower, because a slow connection
there is e.g. 1 Gig or 200 Mb/s. What we're seeing is that some of the
infrastructure is blocking QUIC for various 'security' reasons. Are you looking
at making it more geo-aware?

Jana: First thing is that it shouldn't hurt. If operators are blocking it,
that's fine - Chrome will use TCP.

Andrew D.: The fallback delay is what I'm talking about - it still works, but
the additional delay is noticeable.

Jana: We expect that QUIC will be more feature-rich in future, e.g. with
encryption integrated. I would like to better understand the delays that you
are seeing. We are working on issues where middleboxes drop QUIC traffic after
the handshake.

Colin Perkins: Are you able to break down the performance results based on the
TCP version running at the receiver? Do you win out against all versions of
TCP, or do you not have the data to tell?

Jana: We don't have that data. There's some other micro-benchmarking data. The
Chrome user base is mostly Windows, so you are asking which version of Windows.
Chrome distribution data would help - we might be able to get that.

Mikael Abrahamsson: Are you doing packet-level path MTU discovery?

Jana: We did a bunch of experiments and fixed it at 1350 bytes for all of our
clients.

Martin Gunnarsson: Were mobile clients connected over wifi or cellular?

Jana: Don't have that data right now.

Jana: Please read the paper. There's a fun story about middlebox ossification;
believe it or not, QUIC is already ossifying.

Mirja: Thanks to all our speakers. If you have data, send it to the mailing
list. If you would like to present at our next meeting, the earlier you can let
the chairs know, the better.