Hey all! I'm uncomfortable with my own model right now, because I think, after a lot of debugging and analysis, that it overfits *quite a bit* in areas like Zapata (in TX).

I'm not comfortable leaving it up till I fix it, so I'll spend some time doing so. Taking it down for now.
While I really enjoy doing analysis, one thing I absolutely cannot and will not do is leave up something that I think may be off or wrong in any way, shape, or form.

If it overfits that much in Zapata, why would it not overfit elsewhere?
I'm absolutely and completely uncomfortable leaving it up if that's the case. A fix for this is something that I need to figure out (thanks @rainbow_jeremy_ for the catch). Demographic change alone shouldn't explain it. So perhaps the model is just overfitting.
The problem here is as follows: if it *does* overfit, then might we be overamplifying the (already extremely predictive) power of demographics? And wouldn't that undercut the original argument?

Not to mention that it's kind of dishonest to leave it up.
It's only a problem with the latest iteration of the model, dropped today, IMO -- at least, I'm pretty confident of that. So it goes down, and I spend some time fixing it. When I drop it again, I expect the accuracy to plunge a bit. That's okay. I'd rather be rigorous and thorough.
Anyways, my apologies for the potential mistake on my part, and many thanks to you all for debugging my outputs and sharing your insights with me. They all really, truly help, and I appreciate all of it. I'm going to fix this, and then I'll hopefully show some better stuff soon!
I'm leaving the *previous* iterations up (e.g. my thread on the sunbelt) because no such issues exist there, to the best of my knowledge. This is only a problem with the latest rebuilt version of the model, and that's the one I'm taking down. Everything else posted still holds.
e.g. this model is still correct and doesn't overfit (but its R^2 is nowhere near what I want it to be, so I need to figure out how to account for complexity without just overfitting like crazy).
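
For anyone wondering what "overfits" means in practice, here is a minimal sketch of one common check: compare R^2 on the data a model was fit on against R^2 on data it never saw. Nothing here is the actual model or its data. The data is synthetic, and `fit_line`, `fit_memorizer`, and `train_test_r2` are hypothetical names made up for illustration. The "memorizer" is just a nearest-neighbor lookup standing in for an overly flexible model.

```python
import random

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Ordinary least squares with one feature (closed form)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_memorizer(xs, ys):
    """Deliberately over-flexible: return the y of the nearest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def train_test_r2(fit, train, test):
    """Fit on train, then report (in-sample R^2, held-out R^2)."""
    model = fit(*train)
    r2_train = r_squared(train[1], [model(x) for x in train[0]])
    r2_test = r_squared(test[1], [model(x) for x in test[0]])
    return r2_train, r2_test

def make_data(n):
    """Synthetic data (an assumption): y = 2x + noise."""
    xs = [random.uniform(-1, 1) for _ in range(n)]
    ys = [2 * x + random.gauss(0, 1) for x in xs]
    return xs, ys

random.seed(0)
train, test = make_data(200), make_data(200)

line_train, line_test = train_test_r2(fit_line, train, test)
memo_train, memo_test = train_test_r2(fit_memorizer, train, test)

print(f"line:      train R^2 = {line_train:.2f}, held-out R^2 = {line_test:.2f}")
print(f"memorizer: train R^2 = {memo_train:.2f}, held-out R^2 = {memo_test:.2f}")
```

The memorizer scores perfectly on the data it was fit on and falls off on held-out data, while the simple line scores about the same on both. A large gap between the two numbers is the warning sign.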

• • •

More from @lxeagle17

6 Mar
Cal Cunningham would not have won even if he kept his zipper up.

Biden didn't win the state and shouldn't have, based on the national environment, and in an inelastic state like NC, expecting the senate candidate to run >1.5% ahead of Biden in an open seat is just unrealistic.
Biden only did 0.4% below average in NC. It's not as if he had some shocking underperformance there! He would still have lost by around a percent if everything went to plan in the state and it went exactly according to the national environment.
So why would we expect the downballot candidate to outrun Biden by a percent against a sitting incumbent? That's a tall ask for any normal candidate anyways.

Cunningham isn't great, but that's not why we lost the seat.
1 Mar
Ran a model on the South -- the map below shows the partisan performance by county *relative to what was expected given the 2020 environment* when using race, religion, education, and 2016 partisanship as underlying variables.

The Florida-Georgia contrast is striking.
Florida is a complete disaster. Democrats underperformed in Broward by 4.5%, Palm Beach by 5.1%, and Miami-Dade by 13.2%. Nowhere did they exceed the modeled swing by more than 3.3%.

Doesn't matter if the opponent is Trump, DeSantis, or Rubio. If this is how you do, you're toast.
"Well, Hispanics swung right everywhere! The national environment we were dealing with was way different from what was expected!"

No. *Even accounting for all that*, Democrats absolutely collapsed in Florida this year and were below replacement level. There is no defending this.
1 Mar
Ossoff outran Abrams in the vast majority of counties and unseated a popular incumbent in a race nobody wanted to initially even contest, whereas Abrams couldn't win an open seat.

Abrams lost Georgia by 2 points in a blue wave and she should have won that race.
I have nothing but respect for her work fighting voter suppression and I immensely appreciate the effort of organizers, but stop pretending Abrams was some electoral goddess. There is nothing to suggest anything of the kind and a lot that suggests she underperformed in 2018.
Have a look at this for how Ossoff outran Abrams.

28 Feb
Created a regression model to analyze 2016->2020 swing based on demographics. Much of the swing in South Texas can actually be explained with education, race, urbanization, and religion when analyzing the Sunbelt.

But even then, there was still a definite underperformance there.
Blue means the area swung more towards Biden than expected given demographics. Red means the area swung more towards Trump than expected based on demographics.

Now, a bit of analysis...and what happened is far more complex than anyone wants to believe. There's no easy answer.
To my eye, the populous centers in South Texas would have seen a pretty big swing right either way. Hidalgo and Webb should have swung 17 points right instead of the 23 and 28 points they actually swung.

So there should have been a big swing right. But a swing of this magnitude? Maybe not.
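
As a rough sketch of the arithmetic behind the map colors: the residual is the actual swing minus the swing expected from demographics, and its sign picks blue or red. The figures come from the Hidalgo and Webb numbers above, read as 17 points expected for both counties and 23 and 28 points actual, respectively (that reading is an assumption). The sign convention (positive means toward Biden) and the `residual_color` helper are also assumptions made for illustration, not the model's own code.

```python
def residual_color(actual_swing, expected_swing):
    """Return (residual, color); positive swing = toward Biden (assumed convention)."""
    residual = actual_swing - expected_swing
    if residual > 0:
        return residual, "blue"   # swung more toward Biden than expected
    if residual < 0:
        return residual, "red"    # swung more toward Trump than expected
    return residual, "neutral"

# (actual swing, expected swing) in points; a swing right is negative here.
counties = {
    "Hidalgo": (-23, -17),
    "Webb": (-28, -17),
}

results = {name: residual_color(actual, expected)
           for name, (actual, expected) in counties.items()}

for name, (residual, color) in results.items():
    print(f"{name}: residual {residual:+d} points -> {color}")
```

Under these assumptions, Hidalgo lands 6 points and Webb 11 points further right than the demographic baseline, so both would be shaded red.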
26 Feb
Not even a competition. Get a grip, folks.
Again, Joe Manchin’s value over replacement is insane. Look at the gap between him and Capito in terms of voting with Democrats and realize that without Manchin, we’re praying Biden can just confirm a cabinet.
Sinema is a bit more annoying for me, but she was elected when AZ was flipping from red to blue and had to appeal to a whole ton of folks across the partisan spectrum. And she hasn’t actually broken with Democrats on any vote of consequence.
22 Feb
So, do people think a lack of canvassing only means a turnout differential?

A lack of voter outreach also means folks can switch their vote too, because they just don’t see one side at all.
I said that the GOP drew heavily on low propensity voters turning out to boost their votes in the RGV — in essence, giving you some type of “manufactured swings”. But it’s not like that’s independent from a pool of voters switching their vote.
The word here is “key”. No one denies that there was some swing — how else could Biden underrun Clinton’s raw vote total in Starr?

But that swing is also likely connected to a lack of outreach and canvassing that saw the new GOP voters drown out the new Democratic votes.