Facebook is a newsstand. But no one can see which news Facebook is pushing to the top.
So we built an app for that called #CitizenBrowser. Our first finding: the sharp impact of Facebook’s political ad ban reversal in the Georgia Senate elections. themarkup.org/citizen-browse… /1
This is the first report from our #CitizenBrowser project. There will be many more to come. But first, I want to tell you a bit about how we did it because it’s the most ambitious thing we’ve ever done @themarkup – and we do a lot of ambitious projects. /2
#CitizenBrowser grew out of conversations @suryamattu and I had about how to audit Facebook. We had both worked on browser extensions to collect data from Facebook, but FB always threatened to shut those down, just like their recent threats against the NYU AdObservatory. /3
But the bigger problem is that browser extensions don’t capture most FB activity. Most people use FB on mobile, and browser extensions only work on desktop or laptop computers.
So one night, late at the WeWork, @suryamattu said to me: “We’ll have to build our own browser.” /4
“No way,” I said. “That’s insane.” But slowly, I came around. Here’s why: browsers are good at running in the background. Users could log into FB once through the browser and never use it again.
Then we wouldn’t be monitoring users. We would only be monitoring their feed. /5
So we built a browser. Ok, not quite. We built an Electron app that automates data collection using the Chrome web browser. Then we built tools that remove personal identifiers from that data, such as users’ names and their friends’ names, and discard them. /6
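To illustrate the kind of redaction step described above, here is a minimal sketch, not The Markup's actual tool. It assumes the names to strip are already known as a list; the function name `redact_names` and the `[REDACTED]` placeholder are hypothetical.

```python
import re


def redact_names(text, names, placeholder="[REDACTED]"):
    """Replace every whole-word occurrence of a known name with a placeholder."""
    cleaned = [n.strip() for n in names if n.strip()]
    if not cleaned:
        return text
    # Longest names first, so a full name is matched before a shorter name inside it.
    cleaned.sort(key=len, reverse=True)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(n) + r"\b" for n in cleaned),
        re.IGNORECASE,
    )
    return pattern.sub(placeholder, text)


if __name__ == "__main__":
    post = "Ann Lee liked a post by Bob Ray."
    print(redact_names(post, ["Ann Lee", "Bob Ray"]))
```

A real pipeline would also need to handle identifiers that are not plain names, such as profile links and photos, which this sketch ignores.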
Then we hired a survey research provider to invite a nationally representative sample of U.S. adults to install the app. Panelists are paid to participate, and yes, they understand the privacy issues: here is the Privacy FAQ we provide for panelists. /7
Our panel consists of around 1,000 paid participants from 48 states. It's reasonably representative, although we could use more Latinos and Trump voters, and our crew is older and more educated than the U.S. population, which reflects desktop computer usage. /8
The panel gave us our first peek inside the black box of Facebook’s algorithms. In the past month we saw how news feeds in Georgia changed dramatically when Facebook flipped the switch to turn on political ads. themarkup.org/citizen-browse… /9
And we have much more to share with you in the coming weeks. Sign up for our #CitizenBrowser newsletter to follow along as we try to understand the social media algorithms that shape our lives. themarkup.org/newsletter /10
As always, we show our work. For more details on how we built our app and assembled our panel, read our methodology. themarkup.org/citizen-browse… /11
I’m excited to announce that we have assembled a fantastic team to help us get Citizen Browser launched! Citizen Browser is our ambitious effort to build a national panel to audit social media algorithms: themarkup.org/citizen-browser /1
@corintxt joins the team as Data Reporter - he will be digging through the data to help us find stories. Corin has long worked as a reporter covering technology news. I love this story outing pay-for-play crypto news outlets: breakermag.com/we-asked-crypt… /2
@angiewaller joins the team as Tech Coordinator - she is supporting our panelists & developers. Angie recently finished a master's in Computational Linguistics from @GC_CUNY. Her thesis analyzed objectifying comments in professor reviews: angiewaller.com/detecting-attr…. /3
If Facebook were a TV station, it would be illegal for it to charge different ad prices to the candidates. But Facebook is not subject to the same rules.
Facebook’s response to us was that we don’t understand how ads work:
Blacklight was born from a conversation @suryamattu and I had about updating the privacy series “What They Know” that I led ten years ago at @wsj.
What did we find? The TL;DR: surveillance has become creepier and more difficult to stop.
Using Blacklight, @ASankin found that some of the most sensitive websites on the Internet - banks, medical clinics, child safety - were sharing their users’ personal data with third parties.
SunTrust Bank was sending user passwords to a 3rd party!
Remember when a Google search used to lead you somewhere?
Now it increasingly just keeps you on Google. In fact, Google results take up 62.6% of the first screen of search results in a sample of 15,000 searches.
It wasn't easy to measure Google search results. @LeonYin wrote two custom scrapers and 68 parsers to identify elements on Google search result pages.
As always, all our data, code and an extensive (like REALLY extensive) methodology here: themarkup.org/google-the-gia…
Google's dominance of search results has real consequences. The founder of travel startup Hipmunk told @adrjeffries that Google's decision to boost Google crushed his business.
These materials are not only dangerous - but deadly. In an interview from prison, Eric Falkowski told us that he bought pill presses on @amazon and used them to make counterfeit prescription opioids. His fake pills killed two people and sickened 20 others. /2
Amazon says it catches billions of improper listings a year. But it was pretty easy for us to evade its rules. @jonkeegan set up a seller account and listed two weapons parts for sale just by varying the words and codes he used in the listing. /3
They found that these screening companies often use the loosest possible standards for matching names, including so-called “wild-card” searches where the records of anyone whose name shares the same first three letters as yours can be included in your report. /2
Credit bureaus use much stricter standards for name-matching. In 2017, the big three said they would only match records that contained the same name, address and SSN or date of birth. The tenant screening industry has not made a similar commitment. /3
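The gap between the two matching standards is easy to see in code. The following is a rough sketch, not any screening company's or credit bureau's actual logic. The `Record` fields, the function names, and the sample people are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    name: str
    address: str
    ssn: Optional[str] = None
    dob: Optional[str] = None


def _norm(value: str) -> str:
    return value.strip().lower()


def wildcard_match(applicant: Record, record: Record, prefix_len: int = 3) -> bool:
    """Loose standard: names match if their first few letters are the same."""
    a, r = _norm(applicant.name), _norm(record.name)
    return len(a) >= prefix_len and a[:prefix_len] == r[:prefix_len]


def strict_match(applicant: Record, record: Record) -> bool:
    """Stricter standard: same name, same address, and same SSN or date of birth."""
    same_name = _norm(applicant.name) == _norm(record.name)
    same_address = _norm(applicant.address) == _norm(record.address)
    same_ssn = applicant.ssn is not None and applicant.ssn == record.ssn
    same_dob = applicant.dob is not None and applicant.dob == record.dob
    return same_name and same_address and (same_ssn or same_dob)


if __name__ == "__main__":
    applicant = Record("Johnson", "12 Elm St", ssn="111-22-3333", dob="1980-01-01")
    stranger = Record("Johnston", "98 Oak Ave", dob="1975-06-30")
    print(wildcard_match(applicant, stranger))  # loose rule pulls in the stranger
    print(strict_match(applicant, stranger))    # strict rule does not
```

Under the loose rule, a stranger's record lands in the applicant's report simply because both names start with "Joh". The strict rule rejects it.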