A little thread on The Economist's 2020 Democratic primary polling aggregate, which is showing a big dip for Joe Biden today, but maybe one that's not as large as some people are expecting.

projects.economist.com/democratic-pri…
Throughout the 2020 primary so far our model has been a bit more "conservative" than others (see: RCP, 538) in calculating each candidate's support. Those smoother trends come from a few sources:
First, by design, our aggregate (henceforth called 'the model', as it's really a model of latent voter opinion) needs multiple polls before it decides to shift estimated support for a candidate. This makes it less sensitive to outliers and avoids reacting to noisy data. Why tho?
The theory here is basically that we believe polls are just measurements of some underlying opinions about... stuff (POTUS approval, the economy, the Democratic primary), and that those measurements come with a whole host of errors (from sampling to weighting etc).
We wanted to create a model that was more in line with this theory: polls tell a story about some reality that lies somewhere *between* the measurements, and a story formulated that way depends on data that can change over time (even for the past).
This is a Bayesian worldview in which one poll won't shift our prior understanding of the race. The model thinks similarly, in essence saying to itself "I need to see 2 or 3 polls outside the current margin of error for a candidate before I shift the trend line".
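To make that concrete, here's a toy version of the idea (a conjugate normal update, not the actual Economist model): one outlier poll only moves a reasonably confident prior partway, but a few agreeing polls move it much further.

```python
import math

def update(prior_mean, prior_sd, poll_mean, poll_moe):
    """Conjugate normal-normal update: combine a prior belief about a
    candidate's support with one poll, weighting each by its precision
    (1 / variance). Toy illustration, not the model's actual machinery."""
    poll_sd = poll_moe / 1.96          # convert a 95% MOE into a standard deviation
    w_prior = 1 / prior_sd ** 2
    w_poll = 1 / poll_sd ** 2
    post_mean = (w_prior * prior_mean + w_poll * poll_mean) / (w_prior + w_poll)
    post_sd = math.sqrt(1 / (w_prior + w_poll))
    return post_mean, post_sd

# Prior: Biden at 26 +/- 1.5. One outlier poll says 19 with a 4-point MOE.
m1, s1 = update(26.0, 1.5, 19.0, 4.0)
# Two more polls agreeing with the outlier pull the estimate much closer to 19.
m3, s3 = update(*update(m1, s1, 19.0, 4.0), 19.0, 4.0)
```

The single poll leaves the estimate well above 19; it takes the second and third polls to drag the trend line meaningfully downward, which is the "2 or 3 polls outside the margin of error" behavior described above.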
Second, also by design, the model calculates a "house effect" for each pollster x candidate combination to adjust for the fact that some seem to consistently over- or underestimate certain candidates. YouGov has been pretty Warren-leaning, for example. We adjust for this.
But that adjustment can sometimes be too large, especially for pollsters that only release 1-2 polls. We've told the model not to calculate any house effects that are bigger than ~4 percentage points for any given candidate, but that can still be too much sometimes.
("Sometimes" here meaning "days like today when people really want the model to adjust to new data but it wants to hang back a bit and wait for a new poll")
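A minimal sketch of what a capped house effect could look like (the function, numbers, and cap-handling here are illustrative assumptions, not the model's actual code): estimate a pollster's average deviation from the aggregate trend for one candidate, clip it to +/- 4 points, and subtract it from that pollster's new readings.

```python
def house_effect(pollster_readings, trend_at_poll_dates, cap=4.0):
    """Estimate a pollster's house effect for one candidate as its average
    deviation from the aggregate trend, clipped to +/- cap points."""
    deviations = [p - t for p, t in zip(pollster_readings, trend_at_poll_dates)]
    effect = sum(deviations) / len(deviations)
    return max(-cap, min(cap, effect))

# A pollster that has run 2 points "hot" on a candidate gets that subtracted out:
effect = house_effect([24, 25, 23], [22, 23, 21])
adjusted = 26 - effect  # correct a new 26% reading from the same pollster
```

The clipping is why a pollster with only one wildly divergent poll can't swing the aggregate by its full deviation, and also why the correction can occasionally be too aggressive for pollsters with thin track records.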
Third, the model puts higher weight on polls with higher sample sizes and better track records (per 538). We don't think that all data are created equally and don't want the model to think so, either.
But the weights aren't that aggressive, so on days like today when we only have 2 high-quality pollsters showing huge movement and a few others that aren't that old showing not a lot, the model waits for more data.
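One simple way to picture that weighting (a toy precision-style scheme of my own construction, not The Economist's actual weights): let a poll's weight grow with the square root of its sample size, scaled by a 0-1 pollster quality grade.

```python
import math

def aggregate(polls):
    """Weighted average of polls, each given as (support %, sample size, grade).
    Weight grows with sqrt(sample size) times a 0-1 pollster quality grade.
    A toy stand-in for the model's actual weighting scheme."""
    num = den = 0.0
    for support, n, grade in polls:
        w = math.sqrt(n) * grade
        num += w * support
        den += w
    return num / den

# Two big, high-quality polls showing a drop vs. slightly older, smaller,
# lower-rated polls that don't:
polls = [(18, 1500, 0.9), (19, 1200, 0.9), (25, 600, 0.6), (26, 500, 0.5)]
result = aggregate(polls)
```

Even with the better polls counting for more, the older ones still anchor the average above where the new polls sit, which is roughly the "wait for more data" behavior described above.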
You can probably combine all these factors in your head but the upshot is: there's a lot of uncertainty here, and we don't want the model to react too strongly to new data that it is uncertain about.
(Now, we could do something like model historical polling bumps after certain contests and apply them to projections for the future, but then we might as well develop a forecasting model for votes for the primary, which we're not going to do ok????)
The result of these adjustments is, essentially, that the model can be slower than others to react to new data.

You can see this happening now: Joe Biden has just had three polls putting him under 20%, but the model thinks he's closer to 24% (+/- 2).
The divergence here is due to (a) not wanting to be too aggressive and (b) typically anti-Biden house effects on the pollsters that have released estimates so far. I don't think this "mismatch" (between what our model is saying and what the conventional wisdom is) will last long.
That's because each time the model runs, it recalculates estimated support on every day of the campaign.

So once we get more data, the trend line for the past few days will shift (i.e. downward for Biden), as will the estimate for today's support.
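This "revising the past" behavior falls out of any smoother where every poll informs every day's estimate. Here's a toy Gaussian-kernel version (my own illustration, not the model itself) where a single new low poll drags down the estimates for days that are already behind us:

```python
import math

def trend(polls, days, bandwidth=5.0):
    """Gaussian-kernel smoother over (day, support) polls. Because every poll
    influences every day's estimate, a new poll revises the recent past, not
    just today. A toy stand-in for the full latent-opinion model."""
    out = []
    for d in days:
        num = den = 0.0
        for day, support in polls:
            w = math.exp(-((d - day) / bandwidth) ** 2)
            num += w * support
            den += w
        out.append(num / den)
    return out

polls = [(0, 26), (3, 25), (6, 24)]
before = trend(polls, [5, 6, 7])
after = trend(polls + [(7, 19)], [5, 6, 7])  # one new low poll arrives
# the estimates for days 5 and 6 -- already "in the past" -- drop too
```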
Anyways, as I always say, don't panic about changes in trends that are within the margin of error. Similarly, don't freak out when the "truth" might be right on the edge of the MOE and the model isn't picking it up yet.
Here's a really cool thing though! The model is theoretically equipped to take into account shifting vote intentions at the state level when calculating trends nationally. This makes it a 'hierarchical' or 'multi-level' model. If we do that, the average looks very different:
The model that's up at projects.economist.com/democratic-pri… currently doesn't think this way, but we're considering changing that. We've had a state-level aggregate in the works for a while and if/when we publish it, it would probably make sense to make this change for the national topline too.
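The core idea behind that hierarchical setup is partial pooling: estimates at one level borrow strength from the other, with sparsely-polled units pulled harder toward the shared mean. A toy one-directional sketch (the function, numbers, and pooling strength are all my illustrative assumptions, not the model's specification):

```python
def partial_pool(state_means, state_ns, national_mean, pool_strength=200):
    """Shrink each state's raw poll average toward the national mean; states
    with fewer respondents get pulled harder. Toy partial pooling, not the
    Economist's actual hierarchical specification."""
    pooled = {}
    for state, mean in state_means.items():
        n = state_ns[state]
        w = n / (n + pool_strength)      # more data -> trust the state's own polls more
        pooled[state] = w * mean + (1 - w) * national_mean
    return pooled

# A well-polled state keeps most of its own signal; a thinly-polled one
# gets shrunk toward the national figure:
pooled = partial_pool({"IA": 15.0, "NH": 17.0},
                      {"IA": 800, "NH": 150},
                      national_mean=24.0)
```

In the full hierarchical model the information flows both ways, so consistent shifts across states would also move the national trend line, which is why the averages in the two versions can look so different.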
IN SUM: Our model needs more data than others because it is very uncertain about the world. It is currently showing a big dip for Biden and I expect it to get bigger today/tomorrow.
Anyway if you want to crib the model, here's the code