# Privacy Enhancements and Assessments Research Group (PEARG) Agenda - IETF 113

## Presentations

### "On the ineffectiveness of QUIC PADDING against Website Fingerprinting" - Sandra Siby (25 mins)

Joint work by EPFL and Cloudflare.
Concerned with an adversary between the client and the destination site who is trying to determine which website the client is visiting based on traffic metadata. Assumes DNS and the TLS ClientHello are encrypted.
Assumes a classifier trained on various websites (uses timing, size, directionality of HTTP traffic).
Attacker is working with middleboxes located in an AS that the data transits through.
Trying to identify which website is being visited among all the websites hosted on a given IP.
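
As a rough illustration of the kind of metadata such a classifier could use, the sketch below derives size, direction, and timing features from a packet trace. It is not the presenters' feature set; the `Packet` type, the `trace_features` function, and the chosen features are hypothetical names picked for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    """One observed packet (hypothetical representation for this sketch)."""
    time: float      # seconds since the start of the capture
    size: int        # bytes on the wire
    outgoing: bool   # True = client -> server, False = server -> client


def trace_features(trace: List[Packet]) -> Dict[str, float]:
    """Summarise a trace using only metadata visible to an on-path observer."""
    out = [p for p in trace if p.outgoing]
    inc = [p for p in trace if not p.outgoing]
    times = sorted(p.time for p in trace)
    # Gaps between consecutive packets, regardless of direction.
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "packets_out": len(out),
        "packets_in": len(inc),
        "bytes_out": sum(p.size for p in out),
        "bytes_in": sum(p.size for p in inc),
        "duration": times[-1] - times[0] if times else 0.0,
        "mean_gap": sum(gaps) / len(gaps) if gaps else 0.0,
    }
```

A feature vector like this, computed per page load, is what would be fed to a trained classifier; none of it requires decrypting the payload.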

Previous IETF 111 presentation concluded that QUIC is no harder to fingerprint than TCP.
Question is if it is possible to defend against such an adversary, such as by using QUIC PADDING frames.

Undefended traffic against an unconstrained adversary (all traffic seen) can be fingerprinted with a score of 96%. In this naive case, packet size plays a major role.
Padding all packets to the same length lowers the score to 94%.
Hiding trace-based features, padding total size, only takes it to 92%.
The classifier at this point uses the directionality (packets in each direction, timing) to track.
Injecting "dummy" packets reduces the classifier's effectiveness, but reducing fingerprinting by 10% requires a 50% overhead in dummy packets.
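
To make the cost of these defences concrete, here is a minimal sketch of fixed-size padding and dummy injection applied to a trace, with the resulting byte overhead. It is an illustration under assumptions, not the evaluated implementation: the tuple layout, the 1200-byte target, the uniform random placement of dummies, and all function names are hypothetical.

```python
import random
from typing import List, Tuple

# (time in seconds, size in bytes, outgoing?) -- hypothetical trace format
Packet = Tuple[float, int, bool]


def pad_to_fixed_size(trace: List[Packet], target: int = 1200) -> List[Packet]:
    """Pad every packet up to `target` bytes.

    Packets already larger than `target` are left unchanged in this sketch.
    """
    return [(t, max(size, target), out) for t, size, out in trace]


def inject_dummies(trace: List[Packet], ratio: float,
                   target: int = 1200, seed: int = 0) -> List[Packet]:
    """Add round(len(trace) * ratio) dummy packets at random times and directions."""
    if not trace:
        return []
    rng = random.Random(seed)
    start = min(t for t, _, _ in trace)
    end = max(t for t, _, _ in trace)
    count = round(len(trace) * ratio)
    dummies = [(rng.uniform(start, end), target, rng.random() < 0.5)
               for _ in range(count)]
    return sorted(trace + dummies, key=lambda p: p[0])


def byte_overhead(original: List[Packet], defended: List[Packet]) -> float:
    """Extra bytes sent, as a fraction of the original byte count."""
    before = sum(size for _, size, _ in original)
    after = sum(size for _, size, _ in defended)
    return (after - before) / before if before else 0.0
```

With `ratio=0.5` on an already padded trace, `byte_overhead` comes out at 0.5, which corresponds to the 50% dummy-packet overhead mentioned above.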

Some adversaries are more constrained in terms of what they see and what they do.

Tracking timing to Google alone shows ~70% fingerprintability.

A network-layer defence that does not know which patterns it needs to imitate cannot use padding effectively, so it is less effective.
For example, many sites have a high mix of first- and third-party resources. All parties must participate in the padding/protection to avoid fingerprinting. Even if that is done, padding is still only 16% effective, but injecting 5 dummies is 39% effective.

Sharon Barkai: If we define the adversary as the client sending malware, and the sampling entity is the protection software trying to detect a rogue client, the only way the malware can throw off the detection sampler is to inject dummy traffic?

Sandra: The conclusion is that most defenses against fingerprinting are not that effective, so the detection software would generally be effective.

Nick Doty: To clarify the threat, you are removing all caches and are making a load to the index page of the popular domain names. And you're trying to detect which webpage is being loaded?

Sandra: Correct.

Nick: How does this apply to cases where I am not going to the index page, or I have a custom resource, or I have cached resources? If we're worried about them learning what I'm reading or my actual content, does the threat apply to these cases?

Sandra: We are working with relatively clean traces. We're planning on doing experiments with subpages to see how well the attack works. What we presented is the best case for the adversary.

### "GDPR and Network Privacy" - Luigi Iannone (20 mins)

Looking at how GDPR and IP address privacy intersect.
GDPR came into effect in 2018. It is made up of articles (law) and recitals (notes about how to apply the articles).
Concerned with "personal data" about a human person. Discusses "controllers", "processors", and "processing" of data.

GDPR defines "online identifiers" like IP addresses as personal data, even for temporary addresses.

ISP is allowed to collect data to offer the connectivity service, but not more than that.

In Japan, even anonymized versions of the data are considered personal. In GDPR, anonymized collection is allowed.

Patrick Tarpey: In the EU area, there's the "data retention directive" which requires keeping data records for lawful intercept. To what extent do technologies like MASQUE and oblivious technology render this obsolete?

Luigi: These technologies are very useful for privacy protection. The question is where the data goes. Oblivious methodologies that obfuscate data don't automatically make something GDPR compliant; it still matters what logs you keep.

## Drafts

### [A Survey of Worldwide Censorship Techniques](https://datatracker.ietf.org/doc/draft-irtf-pearg-censorship/) - Mallory Knodel (10 mins)

Started in 2014! Went through RGLC recently. The -04 document has the following changes:

- Self-censorship scaled back
- Domain seizure added
- TLS 1.3 extensions discussion

More issues fixed in -05.

Draft covers *what* is blocked, *detection* of what to block, *how* to block, and the network structure.

Suggesting to drop/close two of the open issues (#62 and #64 (is this the right issue number?))

Asking for another RGLC after incorporating outstanding issues.