@chadloder @kevinriggle @joejerome @SteveBellovin OK, here goes: a true story about social scientists, gay men, and differential privacy.

Not so long ago in the US it was exceedingly difficult to figure out what %age of the population was gay. Being gay was subject to censure and prosecution.
@chadloder @kevinriggle @joejerome @SteveBellovin So some social scientists used a clever mechanism to figure out how many people were gay without learning who, using a room and a coin.

Put a subject in a room alone, give him a coin to flip. If he got heads, he should answer honestly whether he was gay or not.

The clever part comes in if he flipped tails.
@chadloder @kevinriggle @joejerome @SteveBellovin If he got tails, then he should say he was gay, whether he was or not.

So if a subject said he was gay, then he probably just flipped tails on the coin and wasn't gay at all. Each answer on its own doesn't tell you anything interesting, but combined those answers tell a story.
@chadloder @kevinriggle @joejerome @SteveBellovin If you asked enough people, then when you added up all the answers you could figure out what %age of the population was gay: if you got (50+x)% "gay" answers, then the 50% came from the tails coin flips and the x% came from honest "gay" answers. Because only the heads half of the subjects answered honestly, you really only sampled half --> about 2x% gay
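Here's a minimal Python sketch of the coin-flip protocol and the "subtract the 50%, then double" arithmetic. The function names, the fair-coin assumption, and the simulated population are all made up for illustration; nothing here comes from the original study.

```python
import random

def randomized_response(truth: bool, rng: random.Random) -> bool:
    """One subject's answer under the coin-flip protocol (fair coin assumed)."""
    heads = rng.random() < 0.5
    if heads:
        return truth   # heads: answer honestly
    return True        # tails: say "yes" no matter what

def estimate_true_rate(answers: list[bool]) -> float:
    """Undo the coin: yes_rate is roughly 0.5 + 0.5 * true_rate."""
    yes_rate = sum(answers) / len(answers)
    return 2 * (yes_rate - 0.5)

def simulate(true_rate: float, n: int, seed: int = 0) -> float:
    """Run the protocol on n simulated subjects and return the estimate."""
    rng = random.Random(seed)
    answers = [randomized_response(rng.random() < true_rate, rng) for _ in range(n)]
    return estimate_true_rate(answers)

if __name__ == "__main__":
    # Hypothetical population where 10% would truthfully answer "yes".
    print(simulate(true_rate=0.10, n=200_000))
```

With a large simulated sample the estimate should land close to the true rate, even though no single answer is trustworthy on its own.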
@chadloder @kevinriggle @joejerome @SteveBellovin So by applying randomness, the researchers got an answer without learning more than a teeny smidge about any one person.

That's essentially what differential privacy does. That "teeny smidge" corresponds to what's called the "privacy budget".
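For anyone who wants the symbol behind "privacy budget": the thread doesn't spell it out, but the standard definition says a randomized mechanism $M$ is $\varepsilon$-differentially private if, for any two datasets $D$ and $D'$ that differ in one person's data and any set of outputs $S$, the bound below holds. The budget is $\varepsilon$, and smaller means more private.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```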
@chadloder @kevinriggle @joejerome @SteveBellovin Note also that randomness is (unsurprisingly) random. That means it'll mess with your data like whoa. In this experiment, if they didn't have a lot of research subjects, then the percentage of tails coin-flips could be quite far from 50% and their answer would be wrong.
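To see how much the coin "messes with your data", here's a rough Python sketch that repeats the simulated experiment many times at two sample sizes and measures how spread out the estimates are. The 10% true rate, the sample sizes, and the number of trials are arbitrary choices for illustration.

```python
import random
import statistics

def estimate_once(true_rate: float, n: int, rng: random.Random) -> float:
    """Run the coin-flip survey on n simulated subjects; return the estimate."""
    yes = 0
    for _ in range(n):
        heads = rng.random() < 0.5
        truth = rng.random() < true_rate
        # Tails always says "yes"; heads says "yes" only if it's true.
        if not heads or truth:
            yes += 1
    return 2 * (yes / n - 0.5)

def spread(true_rate: float, n: int, trials: int = 200, seed: int = 1) -> float:
    """Standard deviation of the estimate across repeated experiments."""
    rng = random.Random(seed)
    estimates = [estimate_once(true_rate, n, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)

if __name__ == "__main__":
    for n in (100, 10_000):
        print(n, round(spread(0.10, n), 3))
```

With only 100 subjects the spread is about as big as the quantity being measured, so an individual run can even come out negative. It takes a much larger sample before the estimate settles down.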
@chadloder @kevinriggle @joejerome @SteveBellovin With differential privacy, you need a *lot* of data in order to maintain a reasonable (meaning small) privacy budget and still get reasonably accurate results.
@chadloder @kevinriggle @joejerome @SteveBellovin So that's a hand-wavy version of the intuition underneath differential privacy, missing most of the math. There are lots of cool math tricks, but in the end it's all about hiding one person's answers while making it possible to get a (noisy) combined answer to certain kinds of questions.
Thread by Lea Kissner