Logit models are really common in the social sciences. We typically use maximum likelihood (ML) to estimate them. But the excellent properties of ML estimates are mostly asymptotic.
That means the estimates might not be well-behaved in small samples. In particular, some folks worry about small-sample bias in logit models. And that’s a real thing.
(But I don’t think it’s the most important problem—keep reading.)
The figure below shows the percent bias in the coefficient estimates for different intercepts and numbers of explanatory variables (k) as the sample size varies. The bias is hardly negligible in small samples, but it shrinks quickly as N grows.
Fortunately, David Firth came along and suggested a *penalized* maximum likelihood estimator that eliminates almost all of this bias.
If this seems familiar, it should. Zorn’s (2005) paper (that’s @prisonrodeo) is a classic in political science methods classes, and it recommends Firth’s penalty to deal with separation.
Here’s what Firth’s penalty looks like. You just maximize the penalized likelihood L* rather than the usual likelihood L.
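In symbols (Firth 1993): with ℓ the log-likelihood and I(β) the Fisher information matrix,

$$\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\log\left|I(\beta)\right|,$$

i.e., $L^{*}(\beta) = L(\beta)\,|I(\beta)|^{1/2}$, which amounts to multiplying the likelihood by the Jeffreys prior.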
And it really works! Here’s a comparison of the percent bias in the ML and PML estimators. You’ll see that Firth’s penalty just wipes most of the bias away.
BUT WAIT!!!! 🛑
If you’re clever, you’ll ask about variance. Most of the time, when you reduce bias, you increase variance. You have to choose!
But that’s not what happens here.
When you use Firth’s logit, you shrink *both* bias and variance.
That means you don’t have to choose between bias and variance. You can reduce BOTH.
Here’s a figure showing how much more variable your estimates will be if you use ML rather than Firth’s PML.
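Here’s a minimal toy simulation sketching both claims at once (my own illustration, not the paper’s code; the sample size, true slope, and number of replications are all made up):

```r
library(brglm2)

set.seed(1234)
n <- 50          # a small sample
b_true <- 0.5    # true slope

sims <- replicate(2000, {
  x <- rnorm(n)
  y <- rbinom(n, 1, plogis(b_true * x))
  ml  <- unname(coef(glm(y ~ x, family = binomial))["x"])
  pml <- unname(coef(glm(y ~ x, family = binomial,
                         method = "brglmFit", type = "AS_mean"))["x"])
  c(ml = ml, pml = pml)
})

# percent bias in each estimator (PML should sit much closer to 0)
100 * (rowMeans(sims) - b_true) / b_true

# variance ratio (> 1 means ML is more variable than Firth's PML)
var(sims["ml", ]) / var(sims["pml", ])
```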
But even more importantly, it turns out that bias isn’t the big problem in the first place. The shrinkage in the variance is much more important than the reduction in bias.
In many common scenarios, the variance might contribute about 25 times more to the MSE than the (squared) bias does (or even more).
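That’s just the usual decomposition doing its work:

$$\operatorname{MSE}(\hat\beta) = \operatorname{Bias}(\hat\beta)^2 + \operatorname{Var}(\hat\beta).$$

When the variance term dominates, an estimator that shrinks variance buys you far more MSE improvement than one that only fixes bias.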
So you shouldn’t really be using PML to reduce bias; you should be using PML to reduce *variance* (and bias).
All of this means that you should usually use *penalized* maximum likelihood to fit logistic regression models.
As a default, Firth’s penalty makes much more sense than the usual maximum likelihood estimator.
In practice, that means using the {brglm2} package rather than glm().
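Here’s a minimal sketch of what that swap looks like (the data frame `dat` and variables `y`, `x1`, `x2` are hypothetical):

```r
library(brglm2)

# ordinary ML fit
fit_ml <- glm(y ~ x1 + x2, family = binomial("logit"), data = dat)

# Firth-type penalized ML: same glm() call, different fitting method;
# for the logit link, "AS_mean" matches Firth's Jeffreys-prior penalty
fit_pml <- glm(y ~ x1 + x2, family = binomial("logit"), data = dat,
               method = "brglmFit", type = "AS_mean")

summary(fit_pml)
```

The nice design choice here: because brglmFit plugs into glm()’s method argument, anything downstream that understands a glm object keeps working.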
And Twitter will love this! {brglm2} works with @VincentAB’s {marginaleffects} package and @noah_greifer’s {clarify} package.
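A hedged sketch of that interop, continuing with `fit_pml` from above (the variable name "x1" is still hypothetical):

```r
library(marginaleffects)
avg_slopes(fit_pml)   # average marginal effects from the penalized fit

library(clarify)
# simulation-based inference in the King-Tomz-Wittenberg style
sim_ame(sim(fit_pml), var = "x1")
```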
And it can make a big difference! Here’s a comparison for a small data set from Weisiger (2014).
In short, I think Firth’s PML is usually preferable to ML for fitting logit models. It’s always better in theory (smaller bias and variance), easy to implement ({brglm2}), makes a BIG difference in small samples, and still makes a meaningful difference in much larger samples (e.g., N = 1,000).
If you’re interested in this topic, then I recommend the work of Ioannis Kosmidis (@IKosmidis_).
And here’s a nugget for #econtwitter. For a simple treatment/control design with a binary outcome, Firth’s logit produces a better estimate of the ATE than OLS.
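A hedged sketch of that comparison (a data frame `dat` with binary outcome `y` and a 0/1 indicator `treat` are made-up names):

```r
library(brglm2)
library(marginaleffects)

# Firth's logit for the treatment effect
fit <- glm(y ~ treat, family = binomial("logit"), data = dat,
           method = "brglmFit", type = "AS_mean")

# ATE as the average difference in predicted probabilities
avg_comparisons(fit, variables = "treat")

# the OLS (linear probability model) benchmark
coef(lm(y ~ treat, data = dat))["treat"]
```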
I’ve got lots more thoughts on this that I might put in a blog post, but for now, here are two takeaways.
<1> This “small sample” problem isn’t only a small-sample problem: it persists into much larger samples (perhaps N > 1,000).
<2> The real problem isn’t bias; the problem is variance.
If you’re interested, here’s the paper (with Kelly McCaskey) that describes all the details. It’s open access.
The project grew out of my long-term interest in the purpose of reproduction archives: how we can maximize their value and minimize waste.
I've taught MLE for a long time. Since the start, I've struggled to connect the carefully constructed theory of MLE with the "average-of-simulations" point estimate of King, Tomz, and Wittenberg.
I had to tell students: "this kinda works." 🤷‍♂️
This paper makes the connection.
Aside: it's hard to describe how much King, Tomz, and Wittenberg improved statistical practice. IMO, political science is miles ahead of other fields here. Maybe it's {CLARIFY} that finally got people computing easily interpretable quantities? We owe this project a lot.