[{"author": "Sara Dickinson", "text": "

Yay! Thanks again to all who worked on those RFCs

", "time": "2023-07-28T16:33:17Z"}, {"author": "Nick Doty", "text": "

/me not getting much audio from speaker

", "time": "2023-07-28T16:36:53Z"}, {"author": "Daniel Gillmor", "text": "

same in the room

", "time": "2023-07-28T16:37:00Z"}, {"author": "Daniel Gillmor", "text": "

maybe turn off video?

", "time": "2023-07-28T16:37:06Z"}, {"author": "Alexandr Railean", "text": "

Reza, try stopping the video stream.

", "time": "2023-07-28T16:37:06Z"}, {"author": "David Oran", "text": "

and video is frozen too

", "time": "2023-07-28T16:37:10Z"}, {"author": "Watson Ladd", "text": "

Blame avcore :-P

", "time": "2023-07-28T16:37:33Z"}, {"author": "Nick Doty", "text": "

if he could upload slides -- oh, it's already in the list!, we wouldn't need his bandwidth to send the screen video

", "time": "2023-07-28T16:38:08Z"}, {"author": "Shivan Sahib", "text": "

he did

", "time": "2023-07-28T16:38:52Z"}, {"author": "Nick Doty", "text": "

is this even considered an attack? it seems like the intended functionality of the large language models that they reflect substance from the training data

", "time": "2023-07-28T16:44:17Z"}, {"author": "Jonathan Hoyland", "text": "

The creators of the model might not care, but the victims of the leakage might.

", "time": "2023-07-28T16:46:04Z"}, {"author": "Nick Doty", "text": "

Jonathan Hoyland said:

\n
\n

The creators of the model might not care, but the victims of the leakage might.

\n
\n

absolutely, I think it's a serious privacy risk! I just mean, it seems like the intended functionality of the system itself, so it's likely the system is not trying to protect against it

", "time": "2023-07-28T16:47:02Z"}, {"author": "Randy Bush", "text": "

the victims are already suing

", "time": "2023-07-28T16:47:04Z"}, {"author": "Watson Ladd", "text": "

Reflect substance is not the same as exact quoting. It's different from e.g. giving a human the docs where they won't repeat it all but learn from the gist

", "time": "2023-07-28T16:47:23Z"}, {"author": "Daniel Gillmor", "text": "

This framing appears to make an assumption that some data is \"my data\" and other data is \"not my data\". it's not clear that this is how data works.

", "time": "2023-07-28T16:47:24Z"}, {"author": "Martin Thomson", "text": "

So IND-KTD

", "time": "2023-07-28T16:47:27Z"}, {"author": "Daniel Gillmor", "text": "

audio getting choppy again

", "time": "2023-07-28T16:47:38Z"}, {"author": "Martin Thomson", "text": "

Does the adversary know the data?

", "time": "2023-07-28T16:47:41Z"}, {"author": "Kyle Hogan", "text": "

Nick Doty said:

\n
\n

is this even considered an attack? it seems like the intended functionality of the large language models that they reflect substance from the training data

\n
\n

The main point of a lot of Nicholas Carlini's work is to challenge the concept that because something is aggregated (e.g. in a large language model) then it is also private. This was kinda obvious to the theorists, but it was _not_ obvious to the people who build these models.

", "time": "2023-07-28T16:47:57Z"}, {"author": "Randy Bush", "text": "

perhaps the speaker should not send his face video.

", "time": "2023-07-28T16:48:14Z"}, {"author": "Martin Thomson", "text": "

Indistinguishability with known training data

", "time": "2023-07-28T16:48:20Z"}, {"author": "Martin Thomson", "text": "

When I first saw the slides, I was thinking differential privacy, but this is not that weak.

", "time": "2023-07-28T16:49:25Z"}, {"author": "Nick Doty", "text": "

Kyle Hogan said:

\n
\n

Nick Doty said:

\n
\n

is this even considered an attack? it seems like the intended functionality of the large language models that they reflect substance from the training data

\n
\n

The main point of a lot of Nicholas Carlini's work is to challenge the concept that because something is aggregated (e.g. in a large language model) then it is also private. This was kinda obvious to the theorists, but it was _not_ obvious to the people who build these models.

\n
\n

some systems try to provide aggregate data, but I'm not sure large language models are intended to aggregate and hide the individual data. don't the designers want the model to be able to answer questions about specifics in the training set?

", "time": "2023-07-28T16:49:33Z"}, {"author": "Daniel Gillmor", "text": "

is the adversary just trying to figure out whether the victim is present? or are they trying to learn data about the victim?

", "time": "2023-07-28T16:49:36Z"}, {"author": "Randy Bush", "text": "

@dg: both are attacks

", "time": "2023-07-28T16:50:10Z"}, {"author": "Alexandr Railean", "text": "

If the victim is present in a set called \"people with disease X\" - they're learning data about the victim, no?

", "time": "2023-07-28T16:50:21Z"}, {"author": "Martin Thomson", "text": "

This is a mathematical reduction. I believe that for this game, they know the data about the victim, but they want to determine if they are present or not.

", "time": "2023-07-28T16:50:27Z"}, {"author": "Daniel Gillmor", "text": "

right, but we're focusing on something specific here

", "time": "2023-07-28T16:50:28Z"}, {"author": "Watson Ladd", "text": "

You want it to go \"a boy has a high fever, pulse 140 bpm, stable BP, sweating, what are next steps\" and answer not spew HIPAA violations

", "time": "2023-07-28T16:50:48Z"}, {"author": "Martin Thomson", "text": "

I think that you can translate that into being able to learn the data if the data was originally unknown.

", "time": "2023-07-28T16:50:50Z"}, {"author": "Daniel Gillmor", "text": "

@Martin Thomson i'm not convinced you can translate that

", "time": "2023-07-28T16:51:12Z"}, {"author": "Martin Thomson", "text": "

I'm not sure either, so it's worth asking.

", "time": "2023-07-28T16:51:25Z"}, {"author": "Kyle Hogan", "text": "

Nick Doty said:

\n
\n

Kyle Hogan said:

\n
\n

Nick Doty said:

\n
\n

is this even considered an attack? it seems like the intended functionality of the large language models that they reflect substance from the training data

\n
\n

The main point of a lot of Nicholas Carlini's work is to challenge the concept that because something is aggregated (e.g. in a large language model) then it is also private. This was kinda obvious to the theorists, but it was _not_ obvious to the people who build these models.

\n
\n

some systems try to provide aggregate data, but I'm not sure large language models are intended to aggregate and hide the individual data. don't the designers want the model to be able to answer questions about specifics in the training set?

\n
\n

I think it really depends on the model, but a lot of them (especially the ones at Google that Carlini breaks) are not supposed to spit out individual training data. Or at least it was argued that training on data from people's phones etc. was okay because only \"common\" things that were thus not personal would be output by the model.

", "time": "2023-07-28T16:52:37Z"}, {"author": "Martin Thomson", "text": "

However, I think that it follows that if the adversary is able to identify a model that includes a specific input, then it might use the same method to discover the difference between models.

", "time": "2023-07-28T16:52:44Z"}, {"author": "Watson Ladd", "text": "

Alexandr Railean said:

\n
\n

If the victim is present in a set called \"people with disease X\" - they're learning data about the victim, no?

\n
\n

Just because the medical paper is about Mrs. Y and people who know she went to the ER for something awkward might learn more from it doesn't mean the paper itself identifies her. In fact they work hard not to while still conveying what's needed to learn from experience.

", "time": "2023-07-28T16:53:03Z"}, {"author": "Martin Thomson", "text": "

Obviously, it depends on the nature of the model output. If the model is like those used in advertising (buy/not), you might not be able to recover input data reliably.

", "time": "2023-07-28T16:53:32Z"}, {"author": "Daniel Gillmor", "text": "

Also, a dataset about disease X must not contain only data about people with disease X. it's critical that it includes data about people without disease X, otherwise you can't reason about the general population.

", "time": "2023-07-28T16:53:46Z"}, {"author": "Daniel Gillmor", "text": "

otherwise you get conclusions like \"92% of the people with disease X drank water in the week before onset of symptoms.\" which is true, but unhelpful

", "time": "2023-07-28T16:54:22Z"}, {"author": "Watson Ladd", "text": "

@Martin Thomson If I'm getting decision output on targeted input I can learn the features through a pretty quick process

", "time": "2023-07-28T16:55:22Z"}, {"author": "Martin Thomson", "text": "

I'd imagine you could, yes

", "time": "2023-07-28T16:55:32Z"}, {"author": "Nick Doty", "text": "

@kyle thanks Kyle, I'll try to read more on the claims from the model developers. model cards that I can quickly find are silent on these privacy issues.

", "time": "2023-07-28T16:55:54Z"}, {"author": "Martin Thomson", "text": "

the question is whether you are able to recover training data

", "time": "2023-07-28T16:55:55Z"}, {"author": "Eric Orth", "text": "

All depends on the situation. E.g., you could have a model for people all with disease X, some of which took medicine Y and some of which took a placebo.

", "time": "2023-07-28T16:56:04Z"}, {"author": "Daniel Gillmor", "text": "

@Eric Orth agreed, but then i'd call that a dataset about medicine Y

", "time": "2023-07-28T16:56:37Z"}, {"author": "Watson Ladd", "text": "

Depends on the model complexity. This sort of attack is more of an issue with big neural nets

", "time": "2023-07-28T16:57:01Z"}, {"author": "Eric Orth", "text": "

Fair enough, but having disease X is still private information to avoid leaking.

", "time": "2023-07-28T16:57:09Z"}, {"author": "Daniel Gillmor", "text": "

but yes you're right that there could be a privacy leak involving just membership in the dataset

", "time": "2023-07-28T16:57:15Z"}, {"author": "Kyle Hogan", "text": "

differential privacy helps a lot with the memorization problem: https://www.usenix.org/conference/usenixsecurity19/presentation/carlini
\nwhich I think is kinda what this section is getting at as well?

", "time": "2023-07-28T17:00:08Z"}, {"author": "Daniel Gillmor", "text": "

but if this model doesn't map to the actual privacy threats people care about then it's not covering \"the privacy risks\", which is what it says on the tin.

", "time": "2023-07-28T17:00:33Z"}, {"author": "Rohan Mahy", "text": "

There is usually not just ONE right thing to measure in a complicated system

", "time": "2023-07-28T17:01:14Z"}, {"author": "Nick Doty", "text": "

knowing whether a user was in the dataset does seem similar to the differential privacy metric

", "time": "2023-07-28T17:01:33Z"}, {"author": "Watson Ladd", "text": "

But there's things we can measure so we call them everything

", "time": "2023-07-28T17:01:39Z"}, {"author": "Daniel Gillmor", "text": "

this does smell a lot like DP

", "time": "2023-07-28T17:01:42Z"}, {"author": "Kyle Hogan", "text": "

Nick Doty said:

\n
\n

@kyle thanks Kyle, I'll try to read more on the claims from the model developers. model cards that I can quickly find are silent on these privacy issues.

\n
\n

Nicholas Carlini's job is telling the model developers \"no stop that\" and he's very friendly and a great speaker. https://nicholas.carlini.com/
\nI really enjoy his talks (or he can usually be found at USENIX Security).

", "time": "2023-07-28T17:02:16Z"}, {"author": "Martin Thomson", "text": "

Yeah, this is DP. Presumably the presence will leak, but with low certainty.

", "time": "2023-07-28T17:02:42Z"}, {"author": "Martin Thomson", "text": "

That is, contributing will cause the model to differ, but an attacker should not be able to distinguish except probabilistically. The shape of that determines epsilon.

", "time": "2023-07-28T17:03:30Z"}, {"author": "Daniel Gillmor", "text": "

which again goes back to my earlier point that it assumes that some data is \"my data\" and other data is \"not my data\", which isn't how real-world data works.

", "time": "2023-07-28T17:03:45Z"}, {"author": "Cory Myers", "text": "

(In other words, \u201cquerying\u201d for data d reveals at most trade secrets [let\u2019s call them] about the training set, not anything about d itself\u2026?)

", "time": "2023-07-28T17:03:51Z"}, {"author": "Kyle Hogan", "text": "

Martin Thomson said:

\n
\n

Yeah, this is DP. Presumably the presence will leak, but with low certainty.

\n
\n

I think it actually does one better than that and if DP is used (correctly) in training then you should _not_ get memorization.

", "time": "2023-07-28T17:04:01Z"}, {"author": "Martin Thomson", "text": "

@Kyle Hogan I'm not convinced. DP isn't absolute and neither is this metric.

", "time": "2023-07-28T17:04:44Z"}, {"author": "Nick Doty", "text": "

Daniel Gillmor said:

\n
\n

which again goes back to my earlier point that it assumes that some data is \"my data\" and other data is \"not my data\", which isn't how real-world data works.

\n
\n

it probably depends a lot on the dataset. scraping text from the Web is not easily described as one person's data or another. but you could also train a model on structured records that are about individual people

", "time": "2023-07-28T17:04:55Z"}, {"author": "Martin Thomson", "text": "

Our detection techniques here are probably even less precise than the sort of analysis that is performed on a symmetric cipher. (You might view the model as a giant sequence of S-boxes or a sponge or whatever.)

", "time": "2023-07-28T17:05:29Z"}, {"author": "Daniel Gillmor", "text": "

@Nick Doty even structured records that are about individual people aren't only about the specific individual people. My dad's medical records are my data even though they're formally \"my dad's data\". my lover's addressbook contains my data even though it's \"their data\"

", "time": "2023-07-28T17:06:05Z"}, {"author": "Watson Ladd", "text": "

I think it's tricky to apply DP to a neural net because of the singular information matrix (this is a technical limitation). Also how do you apply it to pictures or text?

", "time": "2023-07-28T17:06:23Z"}, {"author": "Daniel Gillmor", "text": "

Even my neighbor's data about factors that might be environmentally related are in some way \"my data\", because we share the environmental factors by virtue of being neighbors

", "time": "2023-07-28T17:07:13Z"}, {"author": "Kyle Hogan", "text": "

Martin Thomson said:

\n
\n

Kyle Hogan I'm not convinced. DP isn't absolute and neither is this metric.

\n
\n

check out section 9.3 here (and the associated talk which goes over the social security number example explicitly)
\nhttps://www.usenix.org/system/files/sec19-carlini.pdf

", "time": "2023-07-28T17:07:24Z"}, {"author": "Jonathan Hoyland", "text": "

To me, a more natural metric might be given some fraction of the data d, what percentage of the remainder can I (efficiently) extract?

", "time": "2023-07-28T17:09:00Z"}, {"author": "Daniel Gillmor", "text": "

here we're running into context collapse. if you train an LLM that will be available for general use, you cannot know the context in which the LLM will be used

", "time": "2023-07-28T17:09:03Z"}, {"author": "Kyle Hogan", "text": "

I definitely agree that you could create a training set that is technically differentially private (but with garbage parameters) and end up with a model that is not private by any reasonable human understanding of privacy. But in terms of whether the model is memorizing data it shouldn't (for some definition of \"shouldn't\"), you can use DP effectively.

", "time": "2023-07-28T17:09:17Z"}, {"author": "Daniel Gillmor", "text": "

so by definition you can't know what data will be sensitive.

", "time": "2023-07-28T17:09:23Z"}, {"author": "Martin Thomson", "text": "

@Kyle Hogan that establishes that you can use DP to reduce advantage under this attack model. That is intuitively correct (as above), but it is not an absolute guarantee. Randomized response provides DP, but it does not guarantee that any given input is actually modified.

", "time": "2023-07-28T17:09:26Z"}, {"author": "Daniel Gillmor", "text": "

:laughter_tears: \"remove the secret before training\"

", "time": "2023-07-28T17:10:50Z"}, {"author": "Martin Thomson", "text": "

\"remove the useful information before training\"

", "time": "2023-07-28T17:11:00Z"}, {"author": "Nick Doty", "text": "

right, a k-anonymity-style mitigation isn't very useful, because a phrase or piece of data could typically be repeated k times

", "time": "2023-07-28T17:11:09Z"}, {"author": "Alex Chernyakhovsky", "text": "

Data Loss Prevention providers will love that suggestion

", "time": "2023-07-28T17:11:13Z"}, {"author": "Kyle Hogan", "text": "

I mean, this is what DP is.
\nMartin Thomson said:

\n
\n

\"remove the useful information before training\"

\n
", "time": "2023-07-28T17:11:20Z"}, {"author": "Watson Ladd", "text": "

In the paper I have questions about the repeat sampling of the gradient noise. This works in practice but how does it work in theory?

", "time": "2023-07-28T17:11:34Z"}, {"author": "Martin Thomson", "text": "

not quite, but sure

", "time": "2023-07-28T17:11:35Z"}, {"author": "Martin Thomson", "text": "

The DP-SGD paper goes into a lot more detail on that @Watson Ladd

", "time": "2023-07-28T17:12:05Z"}, {"author": "Kyle Hogan", "text": "

Nick Doty said:

\n
\n

right, a k-anonymity-style mitigation isn't very useful, because a phrase or piece of data could typically be repeated k times

\n
\n

Strong agree. theoretical privacy (or anonymity) definitions assume that popular inputs must not be private, but that is really not the case.

", "time": "2023-07-28T17:12:14Z"}, {"author": "Martin Thomson", "text": "

https://arxiv.org/abs/1607.00133

", "time": "2023-07-28T17:12:24Z"}, {"author": "Martin Thomson", "text": "

I don't have a lot of respect for k-anon here (thanks largely to papers that Kyle shared)

", "time": "2023-07-28T17:12:51Z"}, {"author": "Jonathan Hoyland", "text": "

Kyle Hogan said:

\n
\n

Nick Doty said:

\n
\n

right, a k-anonymity-style mitigation isn't very useful, because a phrase or piece of data could typically be repeated k times

\n
\n

Strong agree. theoretical privacy (or anonymity) definitions assume that popular inputs must not be private, but that is really not the case.

\n
\n

Ugh, those humans and their \"real-world\" use cases spoil all my fun.

", "time": "2023-07-28T17:13:20Z"}, {"author": "Daniel Gillmor", "text": "

more humans, more problems

", "time": "2023-07-28T17:13:36Z"}, {"author": "Kyle Hogan", "text": "

Martin Thomson said:

\n
\n

I don't have a lot of respect for k-anon here (thanks largely to papers that Kyle shared)

\n
\n

the paper in question: https://www.usenix.org/conference/usenixsecurity22/presentation/cohen

", "time": "2023-07-28T17:13:49Z"}, {"author": "Martin Thomson", "text": "

My view here is that DP is unlikely to be truly effective in shielding a model from memorization

", "time": "2023-07-28T17:13:51Z"}, {"author": "Daniel Gillmor", "text": "

whether \"books\" are publicly-intended data is a matter of some contention: https://www.theguardian.com/books/2023/jul/05/authors-file-a-lawsuit-against-openai-for-unlawfully-ingesting-their-books

", "time": "2023-07-28T17:15:11Z"}, {"author": "Jonathan Hoyland", "text": "

The idea that newspapers don't leak private information is ... Well let's just say there have been a number of high profile court cases recently

", "time": "2023-07-28T17:15:28Z"}, {"author": "Watson Ladd", "text": "

And we haven't even started thinking about fine tuning

", "time": "2023-07-28T17:15:34Z"}, {"author": "Antoine Fressancourt", "text": "

Book authors tend to agree

", "time": "2023-07-28T17:15:36Z"}, {"author": "Nick Doty", "text": "

my concern is more whether it's static. I believe some data is intended to be public, but also that might change over time.

", "time": "2023-07-28T17:15:44Z"}, {"author": "David Oran", "text": "

Not all \"public data\" is covered by \"fair use\" - this is an area that obviously will be strongly litigated for quite a while!

", "time": "2023-07-28T17:16:18Z"}, {"author": "Nick Doty", "text": "

yes, thank you, this was a very interesting presentation :praise:

", "time": "2023-07-28T17:16:59Z"}, {"author": "Kris Shrishak", "text": "

Martin Thomson said:

\n
\n

Kyle Hogan I'm not convinced. DP isn't absolute and neither is this metric.

\n
\n

This paper is worth checking: https://arxiv.org/abs/2010.12112v1

", "time": "2023-07-28T17:17:05Z"}, {"author": "Daniel Gillmor", "text": "

thanks, Reza!

", "time": "2023-07-28T17:17:10Z"}, {"author": "Antoine Fressancourt", "text": "

Thanks for the presentation

", "time": "2023-07-28T17:17:11Z"}, {"author": "Nick Doty", "text": "

let's not rely on copyright maximalism to not very successfully protect privacy

", "time": "2023-07-28T17:18:17Z"}, {"author": "Rohan Mahy", "text": "

Here is a concrete use case. An attacker Eve wants to determine if Alice (a woman in Texas) is pregnant and can time querying a LLM to just before and just after a calendar entry in the middle of the day shows that Alice is out-of-office.

", "time": "2023-07-28T17:19:27Z"}, {"author": "Shivan Sahib", "text": "

@Meetecho Robot is there any chance we can increase Arthur's volume in the room?

", "time": "2023-07-28T17:21:04Z"}, {"author": "Lorenzo Miniero", "text": "

We'll send someone over

", "time": "2023-07-28T17:21:41Z"}, {"author": "Daniel Gillmor", "text": "

even when the traffic is encrypted, the destination of the traffic is often not protected (e.g. SNI, or the IP address of the destination service), which itself is a privacy leak

", "time": "2023-07-28T17:22:10Z"}, {"author": "Antoine Fressancourt", "text": "

And the packet pattern leaks info too

", "time": "2023-07-28T17:22:54Z"}, {"author": "Martin Thomson", "text": "

I find this framing somewhat objectionable

", "time": "2023-07-28T17:23:16Z"}, {"author": "Kyle Hogan", "text": "

or, the traffic is encrypted, but not end to end

", "time": "2023-07-28T17:23:17Z"}, {"author": "Nick Doty", "text": "

I'm surprised that functionality isn't included as even one of the reasons that privacy leaks continue

", "time": "2023-07-28T17:23:34Z"}, {"author": "Juliusz Chroboczek", "text": "

Martin, could you please clarify?

", "time": "2023-07-28T17:23:38Z"}, {"author": "Daniel Gillmor", "text": "

@Martin Thomson what's your objection? so far he hasn't said anything i find particularly controversial

", "time": "2023-07-28T17:23:47Z"}, {"author": "Martin Thomson", "text": "

It's not what was said as much as what was implied. I do agree that these are problems and true, but the implication is that all of those entities are the same.

", "time": "2023-07-28T17:24:42Z"}, {"author": "Martin Thomson", "text": "

And that people at these browsers concretely don't care, or invest in making things better.

", "time": "2023-07-28T17:25:09Z"}, {"author": "Martin Thomson", "text": "

Arthur knows that this is not true. And he knows that it is not as simple as that.

", "time": "2023-07-28T17:25:25Z"}, {"author": "Jonathan Hoyland", "text": "

Is there an ethical issue with a Brave employee running this site?

", "time": "2023-07-28T17:25:34Z"}, {"author": "Martin Thomson", "text": "

@Jonathan Hoyland Yes

", "time": "2023-07-28T17:25:41Z"}, {"author": "Daniel Gillmor", "text": "

i understood what he was saying as \"privacy hasn't been prioritized at most browsers above other objectives\" (though i agree with @Nick Doty that it's surprising that he doesn't identify the functionality goals that sometimes outweigh privacy goals)

", "time": "2023-07-28T17:26:47Z"}, {"author": "Martin Thomson", "text": "

To be clear, I think that this is valuable work, but when they are simplified this way, the differences between lines are erased.

", "time": "2023-07-28T17:26:47Z"}, {"author": "Nick Doty", "text": "

Jonathan Hoyland said:

\n
\n

Is there an ethical issue with a Brave employee running this site?

\n
\n

it might not be ideal, but still could be useful. maybe in the long run this sort of web platform testing could be done by a group, rather than a single person with a particular affiliation

", "time": "2023-07-28T17:26:55Z"}, {"author": "Martin Thomson", "text": "

Daniel Gillmor said:

\n
\n

i understood what he was saying as \"privacy hasn't been prioritized at most browsers above other objectives\" (though i agree with Nick Doty that it's surprising that he doesn't identify the functionality goals that sometimes outweigh privacy goals)

\n
\n

Yes, this is an over-simplification.

", "time": "2023-07-28T17:27:18Z"}, {"author": "Shivan Sahib", "text": "

fwiw we've used Arthur's tool to prioritize privacy work internally at Brave, not the other way around

", "time": "2023-07-28T17:27:54Z"}, {"author": "Martin Thomson", "text": "

Concretely, there are only so many people who can work on this stuff and only so many changes you can make due to other constraints (like not breaking a bunch of sites)

", "time": "2023-07-28T17:28:00Z"}, {"author": "Martin Thomson", "text": "

@Shivan Sahib That sounds right.

", "time": "2023-07-28T17:28:14Z"}, {"author": "Daniel Gillmor", "text": "

the whole project, like any scorecard is a deliberate over-simplification.
\nthat doesn't make scorecards inherently wrong.

", "time": "2023-07-28T17:28:20Z"}, {"author": "Martin Thomson", "text": "

No, but the interpretation of that information is key.

", "time": "2023-07-28T17:28:37Z"}, {"author": "Jonathan Hoyland", "text": "

Can we run these tests ourselves? My browser is highly customised / modified.

", "time": "2023-07-28T17:28:53Z"}, {"author": "Kyle Hogan", "text": "

Jonathan Hoyland said:

\n
\n

Can we run these tests ourselves? My browser is highly customised / modified.

\n
\n

you're about to fail a lot of them then :)

", "time": "2023-07-28T17:29:14Z"}, {"author": "Daniel Gillmor", "text": "

@Jonathan Hoyland then servers can probably identify you from your customizations in the first place :stuck_out_tongue:

", "time": "2023-07-28T17:29:20Z"}, {"author": "Juliusz Chroboczek", "text": "

Brave routinely breaks websites, so it's not obvious that breaking sites on occasion is not acceptable. (I've found them to be reasonably responsive when I reported a bug, though.)

", "time": "2023-07-28T17:29:25Z"}, {"author": "Jonathan Hoyland", "text": "

They totally can

", "time": "2023-07-28T17:29:27Z"}, {"author": "Martin Thomson", "text": "

I think that you can run the tests, the code is open.

", "time": "2023-07-28T17:29:32Z"}, {"author": "Shivan Sahib", "text": "

https://github.com/privacytests/privacytests.org

", "time": "2023-07-28T17:29:35Z"}, {"author": "Daniel Gillmor", "text": "

@Jonathan Hoyland mine too, fwiw :grimacing:

", "time": "2023-07-28T17:29:52Z"}, {"author": "Martin Thomson", "text": "

Contributing to fingerprinting entropy @Jonathan Hoyland and @Daniel Gillmor I see

", "time": "2023-07-28T17:30:30Z"}, {"author": "Kyle Hogan", "text": "

could be nice to make another attempt at pulling the tor browser folks in

", "time": "2023-07-28T17:30:42Z"}, {"author": "Jonathan Hoyland", "text": "

I tried the browser fingerprinting tests once and I stick out like a sore thumb.

", "time": "2023-07-28T17:30:52Z"}, {"author": "Martin Thomson", "text": "

Most of the fingerprinting tests are badly flawed though. They are bad at picking up correlated signals.

", "time": "2023-07-28T17:31:20Z"}, {"author": "Stephen Farrell", "text": "

programming note: I just noticed that half the maprg talks could fit pearg too so the conflict is quite the pity

", "time": "2023-07-28T17:31:32Z"}, {"author": "Sara Dickinson", "text": "

+1

", "time": "2023-07-28T17:32:33Z"}, {"author": "Luigi Iannone", "text": "

+1

", "time": "2023-07-28T17:32:38Z"}, {"author": "Jonathan Hoyland", "text": "

We need a final column \"% of websites that break in this browser\" :joy:

", "time": "2023-07-28T17:32:44Z"}, {"author": "Fernando Gont", "text": "

How long till browsers employ one IP address per, say, tab?

", "time": "2023-07-28T17:33:00Z"}, {"author": "Martin Thomson", "text": "

@Fernando Gont ...

", "time": "2023-07-28T17:33:06Z"}, {"author": "Martin Thomson", "text": "

You'll probably need to pay for that though

", "time": "2023-07-28T17:33:21Z"}, {"author": "Alexander Clouter", "text": "

the ipv6ops have DHCPv6-PD per device...so maybe not long

", "time": "2023-07-28T17:33:25Z"}, {"author": "Juliusz Chroboczek", "text": "

@Martin unless you use tor.

", "time": "2023-07-28T17:33:33Z"}, {"author": "Antoine Fressancourt", "text": "

You can get up to 2^64 tabs running !

", "time": "2023-07-28T17:34:05Z"}, {"author": "Fernando Gont", "text": "

You don't need that. Typically you can configure any address in the /64 -- that's how you do temporary addresses, for instance (RFC 8981)

", "time": "2023-07-28T17:34:06Z"}, {"author": "Watson Ladd", "text": "

The local assigned part rotates on machines with a privacy extension right?

", "time": "2023-07-28T17:34:06Z"}, {"author": "Daniel Gillmor", "text": "

when you use tor, you pay with your time. (said as a happy long-time tor user)

", "time": "2023-07-28T17:34:06Z"}, {"author": "Luigi Iannone", "text": "

DHCPv6-PD per device is not privacy. Your ID is now your prefix..

", "time": "2023-07-28T17:34:07Z"}, {"author": "Alexander Clouter", "text": "

@Luigi...it was a joke

", "time": "2023-07-28T17:34:23Z"}, {"author": "Martin Thomson", "text": "

@Juliusz Chroboczek we did an analysis a few years back and including Tor support would have added too much load to the Tor network. We couldn't responsibly have done that.

", "time": "2023-07-28T17:34:25Z"}, {"author": "Luigi Iannone", "text": "

:-)

", "time": "2023-07-28T17:34:32Z"}, {"author": "Juliusz Chroboczek", "text": "

Martin, who's \"we\"?

", "time": "2023-07-28T17:34:48Z"}, {"author": "Jonathan Hoyland", "text": "

Fernando Gont said:

\n
\n

How long till browsers employ one IP address per, say, tab?

\n
\n

I bet you the moment that happens people will merge them with CGNAT :joy:

", "time": "2023-07-28T17:34:52Z"}, {"author": "Martin Thomson", "text": "

@Fernando Gont common prefix means no useful privacy. Also, some people don't have v6

", "time": "2023-07-28T17:34:55Z"}, {"author": "Martin Thomson", "text": "

@Juliusz Chroboczek Firefox

", "time": "2023-07-28T17:35:03Z"}, {"author": "Antoine Fressancourt", "text": "

@Juliusz Mozilla, I guess

", "time": "2023-07-28T17:35:08Z"}, {"author": "Fernando Gont", "text": "

If you had multiple addresses per browser, and multiple users on the link, that might help. -- definitely far from perfect, though, of course: https://datatracker.ietf.org/doc/html/draft-ietf-opsec-ipv6-addressing-00

", "time": "2023-07-28T17:35:20Z"}, {"author": "Watson Ladd", "text": "

I did that analysis as part of AdScale at CCS16. Even for a tiny fraction of requests it would go down. Read the paper to see almost decade old numbers

", "time": "2023-07-28T17:35:38Z"}, {"author": "Martin Thomson", "text": "

This query parameter thing is pure theatrics

", "time": "2023-07-28T17:35:45Z"}, {"author": "Juliusz Chroboczek", "text": "

@Martin, @Daniel I agree with both of you, though, Tor is not a solution in general.

", "time": "2023-07-28T17:35:48Z"}, {"author": "Jonathan Hoyland", "text": "

Aren't Meta moving to integrity protected URLs?

", "time": "2023-07-28T17:36:09Z"}, {"author": "Martin Thomson", "text": "

@Jonathan Hoyland that only works on properties they control, I believe.

", "time": "2023-07-28T17:36:26Z"}, {"author": "Fernando Gont", "text": "

@Martin: definitely not perfect. But the assumption is that if you have multiple users on the link, at least you can only be tracked at a coarser granularity. -- e.g., home-granularity vs. user granularity

", "time": "2023-07-28T17:36:42Z"}, {"author": "Watson Ladd", "text": "

And tracking content: some of that is for \"is the site up and showing things\"

", "time": "2023-07-28T17:36:55Z"}, {"author": "Martin Thomson", "text": "

For the most part, outbound links can't be integrity protected that way because you need to get sites to agree on a scheme.

", "time": "2023-07-28T17:36:56Z"}, {"author": "Daniel Gillmor", "text": "

@Watson Ladd link for CCS16 paper?

", "time": "2023-07-28T17:37:03Z"}, {"author": "Juliusz Chroboczek", "text": "

You'd need to disable IPv4, otherwise the site can still embed an IPv4-only web bug.

", "time": "2023-07-28T17:37:26Z"}, {"author": "Martin Thomson", "text": "

@Watson Ladd Yeah, there is a lot of stuff out there that can look like tracking, but isn't in practice. These tools sometimes over-index.

", "time": "2023-07-28T17:37:31Z"}, {"author": "Jonathan Hoyland", "text": "

@Martin Thomson Yeah, it was for outbound links. I now can't go to the newspaper article without binding that to my Facebook profile

", "time": "2023-07-28T17:37:37Z"}, {"author": "Fernando Gont", "text": "

@martin: either NAT for IPv6, or use more addresses. Otherwise, privacy-wise things get worse

", "time": "2023-07-28T17:37:46Z"}, {"author": "Watson Ladd", "text": "

https://isi.jhu.edu/~mgreen/advertising.pdf

", "time": "2023-07-28T17:38:05Z"}, {"author": "Jonathan Hoyland", "text": "

They hide the real URL behind their \"link protection\"

", "time": "2023-07-28T17:38:07Z"}, {"author": "Martin Thomson", "text": "

@Fernando Gont yeah, I would really love to have a NAT that mixed my traffic with more people, but network operators don't want to do that for me.

", "time": "2023-07-28T17:38:20Z"}, {"author": "Martin Thomson", "text": "

We asked. It's expensive.

", "time": "2023-07-28T17:38:29Z"}, {"author": "Daniel Gillmor", "text": "

I get those kinds of query parameters all the time when friends share links. They bind me to their FB or Google profile that they were using when they sent me the link

", "time": "2023-07-28T17:38:29Z"}, {"author": "Martin Thomson", "text": "

@Daniel Gillmor sometimes yes, sometimes no.

", "time": "2023-07-28T17:38:45Z"}, {"author": "Martin Thomson", "text": "

The semantics of the query parameters are really unclear.

", "time": "2023-07-28T17:38:58Z"}, {"author": "Shivan Sahib", "text": "

Martin Thomson said:

\n
\n

This query parameter thing is pure theatrics

\n
\n

That's a bit hyperbolic

", "time": "2023-07-28T17:39:07Z"}, {"author": "Reza Shokri", "text": "

Thanks all for the comments and questions on privacy in LLMs. Membership inference attacks and their games are explained here: https://arxiv.org/pdf/2111.09679.pdf. Indeed DP bounds the success of membership inference attacks. The metrics are the same. DP puts a lower bound on the total error that the adversary can make (hence the privacy guarantee). However, as discussed at the end, by making an algorithm DP (resistant to membership inference attacks), we assume that my records contain all my sensitive data, which is not always true (especially for language models).

", "time": "2023-07-28T17:39:16Z"}, {"author": "Martin Thomson", "text": "

gbraid for instance has no documentation and I've not been able to get any details from Google.

", "time": "2023-07-28T17:39:17Z"}, {"author": "Watson Ladd", "text": "

@Martin Thomson what if we did it statelessly?

", "time": "2023-07-28T17:40:00Z"}, {"author": "Kyle Hogan", "text": "

Martin Thomson said:

\n
\n

The semantics of the query parameters are really unclear.

\n
\n

for the ignorant -- what's the complaint here?

", "time": "2023-07-28T17:40:09Z"}, {"author": "Jonathan Hoyland", "text": "

@Shivan Sahib @Martin Thomson nice to see the browsers competing on privacy :wink:

", "time": "2023-07-28T17:40:25Z"}, {"author": "Martin Thomson", "text": "

Shivan Sahib said:

\n
\n

Martin Thomson said:

\n
\n

This query parameter thing is pure theatrics

\n
\n

That's a bit hyperbolic

\n
\n

OK, let me sharpen that a little. It's a useful bit of theatrics - we're doing it. But it is trivial to link actions across navigations, even without query parameters.

", "time": "2023-07-28T17:40:39Z"}, {"author": "Martin Thomson", "text": "

Kyle Hogan said:

\n
\n

Martin Thomson said:

\n
\n

The semantics of the query parameters are really unclear.

\n
\n

for the ignorant -- what's the complaint here?

\n
\n

A query parameter passes some information from one site to another. But we don't know what that information is. for instance, UTM parameters often pass very general information.

", "time": "2023-07-28T17:41:35Z"}, {"author": "Shivan Sahib", "text": "

Sorry folks, had to lock the queue, we're already running a bit behind

", "time": "2023-07-28T17:41:52Z"}, {"author": "Juliusz Chroboczek", "text": "

What does UTM mean?

", "time": "2023-07-28T17:41:54Z"}, {"author": "Daniel Gillmor", "text": "

qparams appear to me to be a low integrity form of cross-platform tracking that works based on loose coordination between origins. active disruption seems like a bit of a cat and mouse game, but i think those games are worthwhile. make them play catch-up!

", "time": "2023-07-28T17:41:58Z"}, {"author": "Martin Thomson", "text": "

https://en.wikipedia.org/wiki/UTM_parameters

", "time": "2023-07-28T17:42:02Z"}, {"author": "Martin Thomson", "text": "

@Daniel Gillmor that's where we are at right now, yes

", "time": "2023-07-28T17:42:15Z"}, {"author": "Juliusz Chroboczek", "text": "

ty

", "time": "2023-07-28T17:42:19Z"}, {"author": "Jonathan Hoyland", "text": "

Daniel Gillmor said:

\n
\n

qparams appear to me to be a low integrity form of cross-platform tracking that works based on loose coordination between origins. active disruption seems like a bit of a cat and mouse game, but i think those games are worthwhile. make them play catch-up!

\n
\n

As long as they don't just use integrity protection and just win

", "time": "2023-07-28T17:43:15Z"}, {"author": "Martin Thomson", "text": "

This query parameter thing is one of the points I was complaining about before. Those 30 parameters are of very little real value, relative to something like cross-site cookies.

", "time": "2023-07-28T17:43:17Z"}, {"author": "Juliusz Chroboczek", "text": "

What's \"oblivious DNS\"?

", "time": "2023-07-28T17:43:18Z"}, {"author": "Shivan Sahib", "text": "

+1 Tommy

", "time": "2023-07-28T17:43:26Z"}, {"author": "Daniel Gillmor", "text": "

and agreed that most utm params i've seen are low-entropy enough that i can't see them being used for individual tracking
\ni do think they're useful for ascertaining overall patterns, which can in turn be used for some sort of invasive inference, though.

", "time": "2023-07-28T17:43:31Z"}, {"author": "Martin Thomson", "text": "

Juliusz Chroboczek said:

\n
\n

What's \"oblivious DNS\"?

\n
\n

https://datatracker.ietf.org/doc/rfc9230/

", "time": "2023-07-28T17:43:45Z"}, {"author": "Martin Thomson", "text": "

Though the new approach is OHTTP + DoH or DoOH

", "time": "2023-07-28T17:43:58Z"}, {"author": "Juliusz Chroboczek", "text": "

ty

", "time": "2023-07-28T17:44:10Z"}, {"author": "Watson Ladd", "text": "

The inference is the point. \"Gee the football ads do better than the basketball ads for the bar, we've got to switch them up\"

", "time": "2023-07-28T17:44:27Z"}, {"author": "Martin Thomson", "text": "

@Daniel Gillmor UTM is generally OK (though some of the newer ones aren't). But they can be used for tracking even then.

", "time": "2023-07-28T17:44:48Z"}, {"author": "Jonathan Hoyland", "text": "

Not a big fan of the name DoOH because the pun relies on an American accent

", "time": "2023-07-28T17:44:57Z"}, {"author": "Daniel Gillmor", "text": "

@Watson Ladd right, and that in turn yields conclusions like \"football fans are more likely to respond to alcohol advertisements\"

", "time": "2023-07-28T17:45:15Z"}, {"author": "Martin Thomson", "text": "

See https://blog.mozilla.org/en/mozilla/understanding-apples-private-click-measurement/ for an example of how low entropy signals can be used for tracking.

", "time": "2023-07-28T17:45:16Z"}, {"author": "Thom Wiggers", "text": "

there is loud buzzing on Iv\u00e1n's sound

", "time": "2023-07-28T17:48:01Z"}, {"author": "Juliusz Chroboczek", "text": "

50Hz.

", "time": "2023-07-28T17:48:18Z"}, {"author": "Thom Wiggers", "text": "

better now

", "time": "2023-07-28T17:48:21Z"}, {"author": "Nick Doty", "text": "

agree with Arthur's last point there that state partitioning is a good example of how browsers can make progress! although even there I think maintaining existing functionality is one of the primary sticking points

", "time": "2023-07-28T17:49:26Z"}, {"author": "Nick Doty", "text": "

state within a site and state across sites are both very useful, and widely abused, and we didn't design it to easily distinguish the use cases and abuse cases

", "time": "2023-07-28T17:51:18Z"}, {"author": "Jonathan Hoyland", "text": "

The root cause is people don't know how to design channel bindings \ud83e\udee0

", "time": "2023-07-28T17:55:00Z"}, {"author": "Shivan Sahib", "text": "

RFC 9416 is a BCP: https://www.ietf.org/rfc/rfc9416.html

", "time": "2023-07-28T18:05:36Z"}, {"author": "Nick Doty", "text": "

one threat we see is when identifiers are rotated on different schedules, then an attacker may be able to re-connect identifiers even after they're changed (even if they're changed with an unpredictable algorithm)

", "time": "2023-07-28T18:05:48Z"}, {"author": "Ayoub MESSOUS", "text": "

@Arthur Edelstein I found your talk very interesting and important, not only for attracting the attention of technical people actually working on this topic but also for raising awareness among the wider public. I had 2 questions:

\n
  1. From your experience working on this topic at Tor and Mozilla, what is the nature of the challenges that stop technical teams from actually solving the privacy issues that you are testing? Is it technically challenging, or more related to not wanting to break already built-in features or business models?
  2. What is the browser that you are currently/frequently using (for your personal use)?
", "time": "2023-07-28T18:06:32Z"}, {"author": "Iv\u00e1n Arce", "text": "

indeed, there are unspoken-of properties of \"ids\"

", "time": "2023-07-28T18:06:54Z"}, {"author": "Daniel Gillmor", "text": "

IDs are persistent across some period of time -- otherwise, it's a \"nonce\"

", "time": "2023-07-28T18:07:21Z"}, {"author": "Iv\u00e1n Arce", "text": "

for example a unique ID need not necessarily be unpredictable.

", "time": "2023-07-28T18:07:24Z"}, {"author": "Daniel Gillmor", "text": "

@Nick Doty is pointing out that the overlaps of the lifetimes of IDs also needs attention

", "time": "2023-07-28T18:07:46Z"}, {"author": "Iv\u00e1n Arce", "text": "

so for interoperability, a global counter is enough for uniqueness, but it could be devastating for security or privacy

", "time": "2023-07-28T18:07:54Z"}, {"author": "Stephen Farrell", "text": "

Shivan Sahib said:

\n
\n

RFC 9416 is a BCP: https://www.ietf.org/rfc/rfc9416.html

\n
\n

Just noticed that https://datatracker.ietf.org/doc/bcp72/ redirects to rfc3552 - I think the old tools page used to include all the RFCs for each BCP which'd maybe be a little better

", "time": "2023-07-28T18:08:16Z"}, {"author": "Iv\u00e1n Arce", "text": "

similarly, a unique ID doesn't need any extra \"semantics\", but if you add additional meaning to it, the risk of linkability or tracking or generally infoleak increases

", "time": "2023-07-28T18:08:43Z"}, {"author": "Daniel Gillmor", "text": "

https://www.ietf.org/info/bcp72 (linked actively in https://www.ietf.org/rfc/rfc9416.html) is itself 404

", "time": "2023-07-28T18:08:56Z"}, {"author": "Watson Ladd", "text": "

Stephen Farrell said:

\n
\n

Shivan Sahib said:

\n
\n

RFC 9416 is a BCP: https://www.ietf.org/rfc/rfc9416.html

\n
\n

Just noticed that https://datatracker.ietf.org/doc/bcp72/ redirects to rfc3552 - I think the old tools page used to include all the RFCs for each BCP which'd maybe be a little better

\n
\n

Wait a BCP is not a single RFC? :exploding_head:

", "time": "2023-07-28T18:09:12Z"}, {"author": "Arthur Edelstein", "text": "

Hi everyone -- thanks for the great feedback! I want to say I completely agree with Martin that many people at browsers care very much about privacy. (I didn't mean to imply otherwise.) You can find such people at every browser organization and Martin is an outstanding example.

", "time": "2023-07-28T18:09:26Z"}, {"author": "Stephen Farrell", "text": "

@Watson Ladd yep, think some of the process ones include >5 rfcs (forget numbers)

", "time": "2023-07-28T18:10:25Z"}, {"author": "Daniel Gillmor", "text": "

:+1: thanks for this work, @Fernando Gont and @Iv\u00e1n Arce -- this is valuable guidance

", "time": "2023-07-28T18:10:26Z"}, {"author": "Sara Dickinson", "text": "

+1 to that - it was a herculean task to pull this all together

", "time": "2023-07-28T18:10:58Z"}, {"author": "Fernando Gont", "text": "

@watson: i learned that BCPs can be more than one document as a result of this work :-)

", "time": "2023-07-28T18:11:19Z"}, {"author": "Fernando Gont", "text": "

@Daniel: thanks for your support during this effort, btw!

", "time": "2023-07-28T18:11:40Z"}, {"author": "Nick Doty", "text": "

Social Security Numbers in the US used to include some geographic indicators (the first three numbers) and sequentially increasing numbers (the last 4), which had exactly the problems that these drafts note! but I learned that that has changed now, so you can't so easily guess my new son's SSN

", "time": "2023-07-28T18:14:20Z"}, {"author": "Arthur Edelstein", "text": "

@Ayoub MESSOUS Tor and Mozilla faced different problems. Tor was generally struggling with a small team in trying to keep up with all the privacy protections that needed to be added and maintained (the team has since grown). Mozilla had more concerns around web compatibility and business model.

\n

As I mentioned, I refrain from recommending a browser to anyone. Personally I mostly use Brave but that's because I work on Brave in my day job and I am dogfooding.

", "time": "2023-07-28T18:15:03Z"}, {"author": "Nick Doty", "text": "

first RFCs! :tada: congrats

", "time": "2023-07-28T18:15:45Z"}, {"author": "Daniel Gillmor", "text": "

QUIC connection IDs are a great example of the conflicts here. It would be great if they were random (or absent) but many people want to add structure and semantics to them :/

", "time": "2023-07-28T18:15:47Z"}, {"author": "Daniel Gillmor", "text": "

@Jonathan Hoyland channel bindings work presuming you already have a cryptographic channel set up and you just need an arbitrary identifier

", "time": "2023-07-28T18:16:08Z"}, {"author": "Juliusz Chroboczek", "text": "

@Meetecho, please pan the camera.

", "time": "2023-07-28T18:16:20Z"}, {"author": "Daniel Gillmor", "text": "

the stuff this work covers includes many cases where you don't already have a cryptographic channel to extract a binding from.

", "time": "2023-07-28T18:16:34Z"}, {"author": "Juliusz Chroboczek", "text": "

ty

", "time": "2023-07-28T18:16:36Z"}, {"author": "Fernando Gont", "text": "

@Nick: thanks for the input: I wasn't aware about the patterns in US SSNs..

", "time": "2023-07-28T18:17:06Z"}, {"author": "Jonathan Hoyland", "text": "

Daniel Gillmor said:

\n
\n

Jonathan Hoyland channel bindings work presuming you already have a cryptographic channel set up and you just need an arbitrary identifier

\n
\n

Correct, and they don't guarantee privacy at all, but I'd say they're clearly related here.

", "time": "2023-07-28T18:17:08Z"}, {"author": "Iv\u00e1n Arce", "text": "

Just a comment re: channel binding. sometimes uniqueness is not sufficient, you also need your IDs to not be predictable to an off-path attacker, or to an attacker that can learn state of the id generator

", "time": "2023-07-28T18:17:35Z"}, {"author": "Watson Ladd", "text": "

Daniel Gillmor said:

\n
\n

QUIC connection IDs are a great example of the conflicts here. It would be great if they were random (or absent) but many people want to add structure and semantics to them :/

\n
\n

Any particular proposal in mind? I think most I've seen keep them fairly random

", "time": "2023-07-28T18:17:47Z"}, {"author": "Jonathan Hoyland", "text": "

Saying \"you can't just crypto\" seems very ... unsupported

", "time": "2023-07-28T18:17:56Z"}, {"author": "Martin Thomson", "text": "

@Watson Ladd QUIC-LB: https://quicwg.org/load-balancers/draft-ietf-quic-load-balancers.html

", "time": "2023-07-28T18:18:27Z"}, {"author": "Iv\u00e1n Arce", "text": "

and sometimes unpredictable IDs aren't enough either, you also need them to not be \"collisionable\". I.e., the DNS query ID may be randomly generated, but it's still just 16 bits, so a futile attempt

", "time": "2023-07-28T18:18:33Z"}, {"author": "Daniel Gillmor", "text": "

@Watson Ladd i think every quic WG session i've been in includes someone who wants to inject structure to do things like load balancing

", "time": "2023-07-28T18:18:34Z"}, {"author": "Fernando Gont", "text": "

We don't say that. We say \"Just because you're using an encrypted channel doesn't mean you can be sloppy about transient numeric identifiers\"

", "time": "2023-07-28T18:18:43Z"}, {"author": "Martin Thomson", "text": "

The idea that you can specify something like this is a little more optimistic than realistic.

", "time": "2023-07-28T18:19:04Z"}, {"author": "Jonathan Hoyland", "text": "

There is crypto designed for exactly this. Yes there is crypto that does other stuff, and yes you can use crypto wrong, but it does exist.

", "time": "2023-07-28T18:19:06Z"}, {"author": "Watson Ladd", "text": "

In a SYN packet?

", "time": "2023-07-28T18:19:25Z"}, {"author": "Martin Thomson", "text": "

There is a requirement, but also trade-offs. Sometimes, people make bad trade-offs, which I think is @Fernando Gont's point, but it's still a real thing.

", "time": "2023-07-28T18:19:35Z"}, {"author": "Daniel Gillmor", "text": "

@Martin Thomson this is guidance to protocol developers. I think you're saying \"protocol developers don't have to read or follow guidance\" and \"protocol reviewers don't have to read or follow guidance\". that's surely true in , but it's also not pointless to provide the guidance.

", "time": "2023-07-28T18:20:10Z"}, {"author": "Fernando Gont", "text": "

@daniel: Indeed, the only pushback we got was from QUIC \"advocates\". exercise: go look at the QUIC spec, and tell me what the interoperability requirements for their IDs are. -- definitely not straightforward... but it should be!

", "time": "2023-07-28T18:20:13Z"}, {"author": "Watson Ladd", "text": "

Daniel Gillmor said:

\n
\n

Watson Ladd i think every quic WG session i've been in includes someone who wants to inject structure to do things like load balancing

\n
\n

Ah yeah that was the one I could think of. I think it's better than everyone doing their own thing and the structuring is mild

", "time": "2023-07-28T18:20:18Z"}, {"author": "Daniel Gillmor", "text": "

people have proposed binding specific backing websites into the connection ID, or persistent client identifiers :/

", "time": "2023-07-28T18:21:23Z"}, {"author": "Rohan Mahy", "text": "

what is \"non-identifying\" internet traffic?

", "time": "2023-07-28T18:21:49Z"}, {"author": "Fernando Gont", "text": "

@Martin: the fact that we have specs that employ transient numeric IDs that don't clearly specify their interoperability requirements should ring a bell. Datapoint: the recent TCP revision got that wrong. And QUIC is not clear (iirc) about the interoperability requirements of e.g. their connection IDs

", "time": "2023-07-28T18:22:16Z"}, {"author": "Juliusz Chroboczek", "text": "

Error 404
\nThe page does not exist or is not available

", "time": "2023-07-28T18:22:31Z"}, {"author": "Martin Thomson", "text": "

Daniel Gillmor said:

\n
\n

Martin Thomson this is guidance to protocol developers. I think you're saying \"protocol developers don't have to read or follow guidance\" and \"protocol reviewers don't have to read or follow guidance\". that's surely true in , but it's also not pointless to provide the guidance.

\n
\n

I read the drafts and the comments during presentation as being a little more absolute than that.

", "time": "2023-07-28T18:22:33Z"}, {"author": "Martin Thomson", "text": "

@Fernando Gont I think we disagree.

", "time": "2023-07-28T18:22:48Z"}, {"author": "Martin Thomson", "text": "

The interoperability requirements are clear. The privacy requirements (unpredictability) are clear.

", "time": "2023-07-28T18:23:11Z"}, {"author": "Fernando Gont", "text": "

@martin: what, specifically, do you disagree about?

", "time": "2023-07-28T18:23:11Z"}, {"author": "Martin Thomson", "text": "

Also, unlinkability.

", "time": "2023-07-28T18:23:29Z"}, {"author": "Iv\u00e1n Arce", "text": "

We don't expect that every protocol author will do an assessment of how they use numeric IDs, how their spec mandates their generation and what impact that has on security and privacy, BUT at least there will be some guidance for those interested in doing it

", "time": "2023-07-28T18:23:51Z"}, {"author": "Juliusz Chroboczek", "text": "

mic: the link to the proposed \"Digital Bill\" is dead (404). Could we please have the full French name of the bill?

", "time": "2023-07-28T18:23:52Z"}, {"author": "Martin Thomson", "text": "

@Fernando Gont :

", "time": "2023-07-28T18:24:11Z"}, {"author": "Jonathan Hoyland", "text": "

To websites organising protests, for example

", "time": "2023-07-28T18:24:16Z"}, {"author": "Martin Thomson", "text": "
\n

Connection IDs MUST NOT contain any information that can be used by an external observer (that is, one that does not cooperate with the issuer) to correlate them with other connection IDs for the same connection.

\n
", "time": "2023-07-28T18:24:19Z"}, {"author": "Fernando Gont", "text": "

@martin: can you provide a pointer to the specification of the interoperability properties of the connection IDs? IIRC, when skimming through the spec, i inferred that the implicit requirement was uniqueness -- yet they were selected incrementally (from a counter).

", "time": "2023-07-28T18:24:35Z"}, {"author": "Martin Thomson", "text": "

That is https://quicwg.org/base-drafts/rfc9000.html#section-5.1-4

", "time": "2023-07-28T18:24:48Z"}, {"author": "Martin Thomson", "text": "

That's an indistinguishability property.

", "time": "2023-07-28T18:25:04Z"}, {"author": "Iv\u00e1n Arce", "text": "

A couple of years ago when we were working on the RFC I looked into QUIC connection IDs, but being a total newbie to the protocol, the learning curve (the RFC is 180 pages) to understand how they may impact S&P was too steep

", "time": "2023-07-28T18:25:20Z"}, {"author": "Daniel Gillmor", "text": "

@Martin Thomson this doesn't say anything about correlating them with other connection IDs for different connections.

", "time": "2023-07-28T18:25:21Z"}, {"author": "Martin Thomson", "text": "

@Daniel Gillmor true, but I think we have that elsewhere. Let me see.

", "time": "2023-07-28T18:25:42Z"}, {"author": "Fernando Gont", "text": "

@Daniel @martin: exactly: what about connection-ids employed for connections with the same entity?

", "time": "2023-07-28T18:26:11Z"}, {"author": "Juliusz Chroboczek", "text": "

The Legifrance link for the Digital Bill is dead.

", "time": "2023-07-28T18:26:44Z"}, {"author": "Martin Thomson", "text": "

@Fernando Gont that's a fair point. I can't see anything in there.

", "time": "2023-07-28T18:26:57Z"}, {"author": "Martin Thomson", "text": "

If you have comments on the QUIC-LB draft, that would be useful.

", "time": "2023-07-28T18:27:07Z"}, {"author": "Fernando Gont", "text": "

@martin: will do!

", "time": "2023-07-28T18:27:56Z"}, {"author": "Antoine Fressancourt", "text": "

A little French context: these days, if you need something to be heard by the French government, your best option is for a police union to relay your message...

", "time": "2023-07-28T18:28:14Z"}, {"author": "Carsten Bormann", "text": "

Oops. The police unions are not bound to laws of nature.

", "time": "2023-07-28T18:28:55Z"}, {"author": "Randy Bush", "text": "

@antoine: that is very telling

", "time": "2023-07-28T18:29:03Z"}, {"author": "Antoine Fressancourt", "text": "

@Randy Indeed, and this is depressing

", "time": "2023-07-28T18:29:33Z"}, {"author": "Juliusz Chroboczek", "text": "

Alliance, the main police union, is arguing for the right to anonymity for policemen.

", "time": "2023-07-28T18:29:57Z"}, {"author": "Peter Koch", "text": "

don't worry, Quad9 actually did get a court order ;-)

", "time": "2023-07-28T18:30:47Z"}, {"author": "Jim Reid", "text": "

I thought if you wanted the French government to listen, you rioted in the streets. :-)

", "time": "2023-07-28T18:30:47Z"}, {"author": "Nick Doty", "text": "

a court order is an (imperfect) mechanism for judicial review and due process

", "time": "2023-07-28T18:31:23Z"}, {"author": "Juliusz Chroboczek", "text": "

We wish. We demonstrated for months against the pensions reform, the government decided to use emergency powers to pass the law.

", "time": "2023-07-28T18:31:43Z"}, {"author": "Daniel Gillmor", "text": "

Jim said \"listen\" he didn't say \"respond positively to\"

", "time": "2023-07-28T18:32:04Z"}, {"author": "Juliusz Chroboczek", "text": "

Point taken.

", "time": "2023-07-28T18:32:20Z"}, {"author": "Daniel Gillmor", "text": "

seems like they heard, and said \"damn, we need to ram this through quickly\"

", "time": "2023-07-28T18:32:25Z"}, {"author": "Florence D", "text": "

Can someone give some more context on what the concerns about age verification on the internet are? Assuming it can be done in a privacy preserving way beyond revealing \"is over/under age x\".

", "time": "2023-07-28T18:32:29Z"}, {"author": "Jim Reid", "text": "

If court orders are imperfect, what's the better alternative?

", "time": "2023-07-28T18:32:31Z"}, {"author": "Watson Ladd", "text": "

Here in SF the 1975 police strike was quite ugly. It's not just a problem in France

", "time": "2023-07-28T18:32:36Z"}, {"author": "Antoine Fressancourt", "text": "

@Jim improve the way courts work ?

", "time": "2023-07-28T18:33:07Z"}, {"author": "Juliusz Chroboczek", "text": "

Florence, think about the government wanting to block (for some reason) a support group for minors.

", "time": "2023-07-28T18:33:08Z"}, {"author": "Watson Ladd", "text": "

Florence D said:

\n
\n

Can someone give some more context on what the concerns about age verification on the internet are? Assuming it can be done in a privacy preserving way beyond revealing \"is over/under age x\".

\n
\n

That's the big problem. Scanning ids is what people are doing.

", "time": "2023-07-28T18:33:12Z"}, {"author": "Stephen Farrell", "text": "

how can age verification be done in a privacy preserving way?

", "time": "2023-07-28T18:33:19Z"}, {"author": "Peter Koch", "text": "

'court order' involves different means in different jurisdictions and sometimes does not include a hearing

", "time": "2023-07-28T18:33:21Z"}, {"author": "Daniel Gillmor", "text": "

@Florence D your assumption is a remarkably strong one. do we have a demonstration that this is actually being done with that kind of privacy preservation?

", "time": "2023-07-28T18:33:50Z"}, {"author": "Juliusz Chroboczek", "text": "

Some background: since 1983 (directive Mauroy), the French police have the right to check your personal belongings and your dwelling without a court order.

", "time": "2023-07-28T18:34:14Z"}, {"author": "Juliusz Chroboczek", "text": "

This is just the natural continuation.

", "time": "2023-07-28T18:34:23Z"}, {"author": "Nick Doty", "text": "

age verification has free expression concerns, not just privacy concerns. for example, some people in the US would like to prevent children from accessing websites about LGBTQ topics.

", "time": "2023-07-28T18:34:54Z"}]