THREAD: After months of data scraping and number crunching, this week we’ve published an analysis of whether @Google's news algorithm displays political bias. (1/25) economist.com/graphic-detail…
.@realDonaldTrump has often claimed that the search engine discriminates against right-leaning publications, because so many of the results for searches about "Trump" come from @nytimes and @CNN. (2/25)
To test whether he was right, we created a computer program that could collect the first page of results on Google's news tab (within its main search engine) for any keyword on any day. (3/25) google.com/search?q=trump…
To make sure that our results weren't being tailored to our personal internet profiles, our computer program used a browser with no history. We operated it from a server based in a swing district in Kansas. (4/25)
For every day in 2018, we scraped the first page of results on Google's news tab, for a selection of 31 keywords across a range of political, economic and newsy topics. That gave us a sample of 175,000 article links. (5/25)
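A quick sanity check on those figures, as a minimal sketch: it takes only the numbers quoted in the thread (31 keywords, every day of 2018, roughly 175,000 links) and works out how many result pages that implies and how many links each page yielded on average. The function names are illustrative and are not taken from the original analysis.

```python
# Figures quoted in the thread; TOTAL_LINKS is a rounded number.
KEYWORDS = 31
DAYS_IN_2018 = 365  # 2018 was not a leap year
TOTAL_LINKS = 175_000


def result_pages(keywords: int, days: int) -> int:
    """One first page of results per keyword per day."""
    return keywords * days


def links_per_page(total_links: int, pages: int) -> float:
    """Average number of article links collected from each page."""
    return total_links / pages


pages = result_pages(KEYWORDS, DAYS_IN_2018)
average = links_per_page(TOTAL_LINKS, pages)
print(f"{pages} result pages, about {average:.1f} links per page")
```

So the sample works out to a little over 15 links per scraped page, on the assumption that every keyword was collected on every day.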
We then picked a broad sample of 37 popular publications - stretching from @DailyKos and @thedailybeast on the far left to @BreitbartNews and InfoWars on the far right - and counted what percentage of these 175,000 links they got. (6/25)
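The counting step can be sketched roughly as follows. This is an illustration only, not the code behind the analysis: the URLs are made up, and matching a link to a publication by its domain name is an assumption about how the tally might be done.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the host of a URL, lower-cased and without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def link_shares(urls, tracked_domains):
    """Percentage of all scraped links that belong to each tracked domain."""
    if not urls:
        return {d: 0.0 for d in tracked_domains}
    counts = Counter(domain_of(u) for u in urls)
    return {d: 100.0 * counts[d] / len(urls) for d in tracked_domains}


# Hypothetical scraped links, for illustration only.
sample_urls = [
    "https://www.nytimes.com/a",
    "https://cnn.com/b",
    "https://www.nytimes.com/c",
    "https://example.org/d",
]
shares = link_shares(sample_urls, ["nytimes.com", "cnn.com", "foxnews.com"])
print(shares)
```

Links from outlets outside the tracked list still count towards the total, which is why the tracked shares need not add up to 100%.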
To work out whether the results skewed left or right, we needed a measure of where each publication sits on the spectrum. We combined ratings from two amateur fact-checking websites, AdFontesMedia.com (@vlotero) and MediaBiasFactCheck.com. (7/25)
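The thread does not say how the two sets of ratings were combined. One common approach, shown here purely as an assumption, is to put each site's scores on a comparable scale (z-scores) and average them. The outlet names and scores below are invented.

```python
from statistics import mean, pstdev


def zscores(ratings):
    """Standardise one site's ratings to mean 0 and standard deviation 1."""
    mu = mean(ratings.values())
    sigma = pstdev(ratings.values())
    return {name: (score - mu) / sigma for name, score in ratings.items()}


def composite(site_a, site_b):
    """Average the standardised scores for outlets rated by both sites.

    Assumption: equal weight for each site. The original analysis may have
    combined the ratings differently.
    """
    za, zb = zscores(site_a), zscores(site_b)
    return {name: (za[name] + zb[name]) / 2 for name in site_a.keys() & site_b.keys()}


# Hypothetical ratings on two different scales (negative = left, positive = right).
site_a = {"outlet_left": -10, "outlet_centre": 0, "outlet_right": 10}
site_b = {"outlet_left": -2, "outlet_centre": 0, "outlet_right": 2}
ideology = composite(site_a, site_b)
print(ideology)
```

Standardising first stops the site with the wider numerical scale from dominating the average.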
It was clear that publications which rated as left-leaning got a higher share of Google results than right-leaning ones. (In the chart below, we scaled the share of the 175,000 links gained by these 37 publications to 100%, and stacked them from left to right.) (8/25)
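The rescaling described in the parenthesis amounts to dividing each publication's share by the combined share of all 37, then ordering by ideology score. A minimal sketch, with invented outlet names, shares and scores:

```python
def rescale_to_100(shares):
    """Rescale shares so that the tracked publications sum to 100%."""
    total = sum(shares.values())
    return {name: 100.0 * value / total for name, value in shares.items()}


def stack_left_to_right(shares, ideology):
    """Return (publication, rescaled share) pairs ordered from left to right.

    Assumption: lower ideology scores mean further left.
    """
    scaled = rescale_to_100(shares)
    return sorted(scaled.items(), key=lambda item: ideology[item[0]])


# Hypothetical raw shares of all links, and hypothetical ideology scores.
raw_shares = {"outlet_a": 3.0, "outlet_b": 1.0}
ideology_scores = {"outlet_a": 1.0, "outlet_b": -1.0}
stacked = stack_left_to_right(raw_shares, ideology_scores)
print(stacked)
```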
The obvious explanation might seem to be that Google's algorithm is biased against right-leaning publications. Google denies this. And there might be some non-ideological reasons why @washingtonpost got three times as many links as @FoxNews. (9/25)
In fact, Google has a long list of criteria that it gives to its 10,000+ human "search quality evaluators", who rate websites according to their "expertise", "authoritativeness" and "trustworthiness", among other things. (10/25) static.googleusercontent.com/media/www.goog…
Among the measures that it asks its raters to consider are Pulitzer prizes, "ratings from independent organisations", and public opinion. (11/25)
So we decided to see if we could predict what share of Google's results a publication ought to get, using the apolitical criteria that the company says feed into its algorithm. (12/25)
The first variables that we put into our model measured "trustworthiness". We combined ratings from those two fact-checking websites to give each publication an accuracy score. And we asked @YouGov to poll 1,500 Americans about how much they trust each publication. (13/25)
Then we added the number of @PulitzerPrizes each publication has won, whether it was a print, broadcast or online outlet, and whether it had a paywall. (Online-only publications did worse; having a paywall was associated with slightly fewer links, after controlling for everything else.) (14/25)
Next we included how much American web traffic (via @SimilarWeb) a publication gains from sources other than Google, as a proxy for its national audience, and also its tally of Facebook followers. Websites with little traffic but lots of FB fans got few search results. (15/25)
Finally, we accounted for how often our selected 37 publications wrote about each of these 31 keywords, using data from @Meltwater. In 2018, for example, @CNN published 3.3 times as many articles mentioning "Mueller" as @FoxNews did. (16/25)
Was our model any good? It did a reasonable job of predicting how many Google links each publication would get. (Specifically, it could account for nearly four-fifths of the differences between how often publications appear in Google's news tab.) (17/25)
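"Accounting for four-fifths of the differences" is usually expressed as R², the share of variance in the outcome that the predictions explain, which would put this model's R² at a little under 0.8. The sketch below shows the standard formula on made-up numbers. It is not the model itself, whose fitting details the thread does not give.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`.

    R^2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction
    errors and SS_tot is the sum of squared deviations from the mean.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical link shares (in %) for four outlets, and a model's predictions.
actual_shares = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(actual_shares, [1.0, 2.0, 3.0, 4.0])
mean_only = r_squared(actual_shares, [2.5, 2.5, 2.5, 2.5])
print(perfect, mean_only)
```

A model that always predicts the average share scores 0 and a perfect model scores 1.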
But crucially, when we compared our model's predictions to the actual share of the 175,000 Google links that each publication received, we found no evidence that right-leaning sites did worse than expected. (18/25)
In other words, once we accounted for how trustworthy, reputable and popular a news organisation is, knowing where it sat on the political spectrum made no difference to our predictions. (Adding the variable to our model did not improve its accuracy.) (19/25)
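One simple way to look for a leftover ideological effect is to check whether the model's errors (actual share minus predicted share) line up with ideology scores. This is a swapped-in illustration, not the test described in the thread, which added ideology as a variable to the model. All numbers are invented.

```python
from math import sqrt


def residuals(actual, predicted):
    """Actual minus predicted: positive means the outlet beat expectations."""
    return [a - p for a, p in zip(actual, predicted)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical ideology scores (negative = left) and link shares in %.
ideology_scores = [-2.0, -1.0, 1.0, 2.0]
actual = [5.0, 3.0, 3.0, 5.0]
predicted = [4.0, 4.0, 4.0, 4.0]

errors = residuals(actual, predicted)
r = pearson(ideology_scores, errors)
print(f"correlation between ideology and residuals: {r:.2f}")
```

A correlation near zero, as in this toy data, is what "no evidence that right-leaning sites did worse than expected" would look like. A clearly negative value would mean right-leaning outlets were falling short of their predicted share.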
The likely reason, therefore, that right-leaning websites get less exposure on Google's news tab is not their ideology, but the fact that they score less well on measures of accuracy and authority than their left-leaning equivalents do. (20/25)
(For those of you wondering: the composite ideological measure from the fact-checking websites put @TheEconomist on the centre-right, near @FT and @Forbes. We slightly underperformed relative to our very small expected level of Google visibility.) (21/25)
Of course, the results varied substantially depending on which keyword we looked at. If you play around with the fantastic interactive built by @martgnz, you will see that left-leaning publications were indeed strongly overrepresented on searches for "Trump". (22/25)
But when it came to articles which included the word "liberal", right-leaning publications got far more links than you would expect. (23/25)
All of which suggests that one of the most important variables in Google's algorithm is how interesting an article is - or at least, how likely it is to attract clicks. Left-leaning publications write incendiary articles about Trump; right-leaning ones do so about liberals. (24/25)
This is not definitive proof that Google's algorithm is politically unbiased. If Pulitzer judges or fact-checkers are skewed, then our variables (and Google's human raters) could be too. Perhaps a different sample of keywords or publications would give different results. (25/25)
Thread by James Tozer.