Heading to Facebook today for a fireside chat with @SolomonMg about The Ethical Algorithm. In preparing I was looking into re-identification attacks against production systems that purport to protect privacy, and read about Aircloak's system Diffix. A short thread. 1/
Diffix provides an interactive system by which users can query data, and returns answers that are perturbed with small amounts of noise. But despite the name, it doesn't promise differential privacy. They are proud of this! On their website they write that: 2/
"Aircloak’s approach has engineered away the need for a privacy budget by producing tailored pseudo-random noise values that do not average away... which in turn leads to the ability to ask as many queries of your dataset as you desire." So they claim to do away with... 3/
one of the main weaknesses of differential privacy: a limited privacy budget! But wait a minute: the inability to answer an unlimited number of queries isn't a limitation of differential privacy. It follows from what is called the "Fundamental Law of Information Recovery"! 4/
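To make the phrase "noise values that do not average away" concrete, here is a minimal Python sketch. It is not Diffix's actual mechanism: the true count, the noise width, the salt, and the function names are all hypothetical. It contrasts fresh independent noise (which cancels out when the same query is repeated) with "sticky" noise derived deterministically from the query text (which does not).

```python
import hashlib
import random
from statistics import mean

TRUE_COUNT = 42      # hypothetical true answer to one counting query
NOISE_WIDTH = 5.0    # hypothetical noise range: uniform in [-5, 5]

def fresh_noise_answer(rng):
    # Independent noise on every call: repeating the query averages it away.
    return TRUE_COUNT + rng.uniform(-NOISE_WIDTH, NOISE_WIDTH)

def sticky_noise_answer(query, salt=b"hypothetical-secret-salt"):
    # Noise seeded by the query text: the same query always gets the same noise.
    seed = int.from_bytes(hashlib.sha256(salt + query.encode()).digest(), "big")
    return TRUE_COUNT + random.Random(seed).uniform(-NOISE_WIDTH, NOISE_WIDTH)

query = "SELECT count(*) FROM users WHERE age > 40"  # illustrative query text
rng = random.Random(1)

fresh_avg = mean(fresh_noise_answer(rng) for _ in range(10_000))
sticky_avg = mean(sticky_noise_answer(query) for _ in range(100))

print("fresh noise, averaged: ", round(fresh_avg, 3))
print("sticky noise, averaged:", round(sticky_avg, 3))
```

Sticky noise only defeats this simplest attack of repeating one query. It does nothing against an attacker who asks many different queries, which is the setting the Fundamental Law of Information Recovery is about.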
We've known since 2003 from the work of @Kobbini and Dinur that answering sufficiently many queries to sufficiently high accuracy allows one to reconstruct almost the entire dataset, and so to violate -any- reasonable notion of privacy! So how can Aircloak get around this? 5/
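A toy illustration of the reconstruction idea, as a sketch under stated assumptions: each record holds one secret bit, the system answers every subset-count query, and each answer is off by at most a fixed bound. This is the brute-force, exponential-time version of the argument, not the linear-programming attack that was run against Diffix, and every name and parameter below is made up for the example.

```python
import random

def subset_count(bits, mask):
    # Number of records selected by `mask` whose secret bit is 1.
    return bin(bits & mask).count("1")

def noisy_answers(secret, n, bound, rng):
    # Answer every subset-count query with noise in [-bound, bound].
    return [subset_count(secret, mask) + rng.uniform(-bound, bound)
            for mask in range(1 << n)]

def reconstruct(answers, n, bound, tol=1e-9):
    # Return the first candidate dataset consistent with all noisy answers.
    # `tol` only absorbs floating-point rounding.
    for cand in range(1 << n):
        if all(abs(subset_count(cand, mask) - a) <= bound + tol
               for mask, a in enumerate(answers)):
            return cand
    return None

def hamming(a, b):
    # Number of bit positions where the two datasets differ.
    return bin(a ^ b).count("1")

n, bound = 10, 1.0                 # 10 records, noise of magnitude at most 1
rng = random.Random(0)
secret = rng.getrandbits(n)        # the hidden dataset, one bit per record
answers = noisy_answers(secret, n, bound, rng)
guess = reconstruct(answers, n, bound)

print(hamming(secret, guess), "of", n, "bits wrong")
```

Any candidate that survives the consistency check can differ from the true data in at most 4 * bound positions: querying exactly the records where the two disagree in one direction would otherwise produce an answer that cannot be within the bound of both. In this toy setting the number of wrong bits is capped by the noise bound and does not grow with the number of records, so once the dataset is large compared with the noise the attacker recovers almost all of it.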
The answer is that they cannot. :-) In fact the 2003 attack works against Diffix, as Aloni Cohen and @Kobbini recently showed: arxiv.org/pdf/1810.05692… Separately, @yvesalexandre and colleagues demonstrated an attack that makes only 32 queries per user: arxiv.org/pdf/1810.05692… 6/
Aircloak has claimed to have changed their system to defend against these specific attacks: aircloak.com/break-my-lifes… but still claims to be able to answer an unlimited number of queries accurately, and to satisfy GDPR privacy requirements. Color me skeptical. 7/
The history of data privacy before differential privacy looked like this: privacy researchers would propose some system of heuristics to anonymize data. Then, clever attackers would come along and find an exploit. The researchers would patch it up, and this would repeat. 8/
It was a losing game for the privacy side. The advent of differential privacy put a stop to this cycle by offering rigorous guarantees. And the concept of accurate access to data as a necessarily limited and budgeted resource is fundamental (as we've known since 2003) 9/
and not specific to differential privacy. The attacks on Diffix were white hat in that they were done by researchers who published their work, but they could have easily been "black hat" and we'd never know. Real guarantees are important when we are worried about adversaries. 10/
Anyhow, anyone interested in reading more about the history of the science of data privacy, the advent of differential privacy, and what it promises and does not promise can read about it in the first chapter of The Ethical Algorithm: amazon.com/Ethical-Algori… 11/11
Correction: I pasted the wrong link for the second attack on Diffix (the one by @yvesalexandre and colleagues). You can find that paper here: usenix.org/conference/use… (Thanks @jugander for noticing!)
Keep Current with Aaron Roth
