Health researchers tend to default to a "standard" suite of things to control for: age, gender, and race.

Today, I want to talk a bit about what "controlling" for race means, and why it matters. A later thread will talk about interpreting differences in outcomes by race.

1/16
Why control for things at all?

The general idea is that we are trying to isolate one thing by "removing" the impact of other things. So, when we "control" for race, we are trying to remove race, so we can focus on something else.

But of course, it isn't that simple.

2/16
What we are "removing" isn't always so obvious, and when we are talking about race, it's EXTREMELY not obvious.

That's because the race variable carries with it the infinite well of everything tied to race and racism.

3/16
Just the measurement, framing, and category selection can have a HUGE impact on how people respond and what the variable means.

Not the focus here, but worth noting that "white" is usually the default coded reference category. That is a problem, both socially and statistically.

4/16
Race is linked to EVERYTHING: where you live, history going back centuries, personal experiences, structural and personal racism and discrimination, etc. Compared to those things, the role of genetics is negligible, but it is the thing people tend to focus on.

5/16
So, when you "control" for race, you aren't just controlling for the color of someone's skin or their genes, you are removing everything that happens that is different for folks who are declared to be of one race vs another.

Are you sure you want to do that?

6/16
If you really did want to remove all that, a binary race variable or two isn't gonna cut it.

Besides, race isn't just skin or genetics; race IS those experiences. It's a social construct.

theatlantic.com/national/archi…

7/16
But of course, it can get worse. Much worse.

Often, things work differently in different populations for all kinds of reasons. Sometimes those who are more deprived see bigger impacts. Sometimes things don't work because they rely on other factors.

8/16
When we control for race, we often cut those things away. Remember, in stats, white is typically the default. So when things work differently for Black folks (which they often do), controlling for race can make that critical information go away.

9/16
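
A toy sketch of that last point, with made-up, noise-free numbers (the two groups, the effect sizes of 1.0 and 3.0, and all function names are assumptions for illustration, not anything from a real study). It "controls" for group by demeaning within group, which gives the same treatment coefficient as a regression of outcome on treatment plus a group indicator, and compares that single number with the group-specific effects.

```python
from statistics import mean

# Hypothetical data: (group, treated, outcome).
# Group 0 is the coded reference group. The treatment effect is
# 1.0 in group 0 and 3.0 in group 1 (illustrative values only).
def make_rows(n_per_cell=50):
    rows = []
    for g in (0, 1):
        for t in (0, 1):
            effect = 1.0 if g == 0 else 3.0
            y = 10.0 - 2.0 * g + effect * t
            rows.extend([(g, t, y)] * n_per_cell)
    return rows

def adjusted_effect(rows):
    """Treatment coefficient from outcome ~ treated + group, computed by
    demeaning treatment and outcome within each group
    (the Frisch-Waugh-Lovell result)."""
    num = den = 0.0
    for g in {r[0] for r in rows}:
        sub = [r for r in rows if r[0] == g]
        t_bar = mean(r[1] for r in sub)
        y_bar = mean(r[2] for r in sub)
        num += sum((t - t_bar) * (y - y_bar) for _, t, y in sub)
        den += sum((t - t_bar) ** 2 for _, t, _ in sub)
    return num / den

def stratum_effect(rows, g):
    """Difference in mean outcome, treated vs untreated, within one group."""
    treated = [y for grp, t, y in rows if grp == g and t == 1]
    control = [y for grp, t, y in rows if grp == g and t == 0]
    return mean(treated) - mean(control)

rows = make_rows()
pooled = adjusted_effect(rows)
by_group = {g: stratum_effect(rows, g) for g in (0, 1)}
print("adjusted for group:", pooled)
print("within each group:", by_group)
```

In this balanced toy setup the adjusted coefficient lands halfway between the two group-specific effects, so it describes neither group. The fact that the effect is three times larger in one group is simply not visible in the "controlled" estimate.
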
Then there is the "what does it mean" question. And the answer to that is, in general, I have no idea.

Useful interpretation of coefficients can be hard even with very independent variables, but race is related to EVERYTHING.

10/16
In some cases, it means that the coefficient of interest is translatable as the effect for the "default" race, which is usually white people. In others, it's wildly untranslatable. It's case by case, but you REALLY need to spend the time thinking about it.

11/16
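
One case where the coefficient is translatable, sketched with invented cell means: if the model includes a treatment-by-group interaction, the plain treatment coefficient is the effect in whichever group is coded as the reference. The `ols` helper, the cell means, and the 0/1 group coding below are all assumptions for illustration.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination. Fine for tiny toy problems."""
    k = len(X[0])
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Hypothetical mean outcomes keyed by (group, treated).
# Treatment effect is 1.0 in group 0 and 3.0 in group 1.
CELL_MEANS = {(0, 0): 10.0, (0, 1): 11.0, (1, 0): 8.0, (1, 1): 11.0}

def treatment_coefficient(reference, n_per_cell=25):
    """Fit outcome ~ treated + other_group + treated:other_group and
    return the coefficient on `treated`."""
    X, y = [], []
    for (g, t), outcome in CELL_MEANS.items():
        d = 0.0 if g == reference else 1.0  # indicator: NOT the reference group
        for _ in range(n_per_cell):
            X.append([1.0, float(t), d, t * d])
            y.append(outcome)
    return ols(X, y)[1]

coef_ref0 = treatment_coefficient(reference=0)
coef_ref1 = treatment_coefficient(reference=1)
print("treatment coefficient, group 0 as reference:", coef_ref0)
print("treatment coefficient, group 1 as reference:", coef_ref1)
```

Same data, same model, and the headline coefficient changes from about 1.0 to about 3.0 purely because of which group was coded as the default. Whoever is the reference group is whose effect gets reported.
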
Let's say I want to measure the impact of infection with SARS-CoV-2 on mortality. I decide, for whatever reason, I am going to control for race.

What is likely to happen?

To start, we should think about what factors are likely to increase disease severity.

12/16
Just about everything that is different in the experience of being Black in America would be a potential factor here. Worse access to healthcare, worse experiences when IN healthcare, deprivation causing worse underlying health conditions, education, wealth. Everything.

13/16
When we control for race, using white as the default, we often end up underestimating the severity of the disease, because we have in a very literal sense controlled away the experiences of the people who need our help the most.

It's bad stats, but it's much more.

14/16
When we unthinkingly treat race as a nuisance variable (actual term), we are actively trying to ignore these issues, and the people who come with them.

Instead, we need to be facing those differences head on, for our science, and for the people we do science to help.

15/16
Part II of this will probably come in a few days, but we'll explore another aspect of race and statistics:

What do DIFFERENCES by race, or the impact of race/racism, actually mean?

16/16
An addendum: What can we actually do?

There is no "general" solution to this, and it depends on the problem you are looking at.

The basic requirement is that you absolutely have to spend the time thinking through what controlling for race means in your context.

17/16
A good thought exercise to get you going is to stratify your sample by race, and spend the time thinking through the potential reasons you might see (or might not see!) differences.

Another is to bring someone on board who has expertise in race and stats who can help.

18/16
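
As a starting point for the stratification exercise, here is a minimal sketch that tabulates a risk difference within each stratum next to the pooled one. The group labels and every count are invented purely for illustration; real data would come with all the measurement and category problems discussed earlier in the thread.

```python
from collections import defaultdict

# Hypothetical counts, invented for illustration only:
# (group, exposed) -> (deaths, people)
COUNTS = {
    ("group_a", True): (10, 100),
    ("group_a", False): (5, 100),
    ("group_b", True): (30, 100),
    ("group_b", False): (10, 100),
}

def risk_differences(counts):
    """Risk among exposed minus risk among unexposed, within each group."""
    by_group = defaultdict(dict)
    for (group, exposed), (deaths, people) in counts.items():
        by_group[group][exposed] = deaths / people
    return {g: risks[True] - risks[False] for g, risks in by_group.items()}

def crude_risk_difference(counts):
    """The same contrast with everyone pooled, ignoring group."""
    totals = {True: [0, 0], False: [0, 0]}
    for (_, exposed), (deaths, people) in counts.items():
        totals[exposed][0] += deaths
        totals[exposed][1] += people
    return (totals[True][0] / totals[True][1]
            - totals[False][0] / totals[False][1])

stratified = risk_differences(COUNTS)
crude = crude_risk_difference(COUNTS)
for group, rd in stratified.items():
    print(f"{group}: risk difference = {rd:.3f}")
print(f"everyone pooled: risk difference = {crude:.3f}")
```

The table is not the answer. It is the prompt for the thinking: why might the stratum-specific numbers differ (or not differ), and what would a single adjusted number be hiding?
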
The next thread is going to deal more directly with interpreting differences by race, drawing heavily on @WhitneyEpi's "On the Causal Interpretation of Race," so stay tuned!

Also, thanks to @mcclure_libby for editing help!

19/16
Keep Current with Noah Haber
