Paul Hünermund
Jun 24, 2022 · 13 tweets · 6 min read
This is my favorite teaching example for showing the importance of #CausalInference: @Google conducts an annual pay equity analysis in which they use fairly advanced statistical techniques. In 2019 they found that they were actually underpaying MEN?! npr.org/2019/03/05/700… 1/
What do they do specifically? They collect a lot of data (as Google does) and then run OLS regressions of annual compensation on demographic variables (gender, race) and other explanatory variables such as tenure, location, and performance. services.google.com/fh/files/blogs… 2/
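To make the setup concrete, here is a minimal sketch of such a regression in plain Python. It is not Google's actual specification: the data, the single control (tenure), and the coefficient values are invented for illustration, and the small `ols` helper is a hypothetical stand-in for a proper statistics package.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Invented data: pay (in $1,000) = 100 - 3 * female + 5 * tenure, without noise.
X, y = [], []
for tenure in range(1, 6):
    for female in (0, 1):
        X.append([1.0, float(female), float(tenure)])  # intercept, gender dummy, tenure
        y.append(100.0 - 3.0 * female + 5.0 * tenure)

intercept, gender_coef, tenure_coef = ols(X, y)
print(gender_coef, tenure_coef)
```

The coefficient on the gender dummy is the quantity such an analysis reads as the adjusted pay gap. Whether that reading is justified depends on which other variables sit on the right-hand side.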
If they find statistically meaningful differences, @Google is actually committed to making upward adjustments for the disadvantaged groups. In this case it was male, level-4 software engineers who got a raise. 3/
But here comes the problem: Google runs these regressions separately for specific groups of employees, based on their job level and function. They do this to avoid comparing 🍎 with 🍐. And why wouldn't you? 4/
Well, we know that adjusting for a third variable can sometimes do funny things to the sign of a statistical relationship. This is the famous Simpson's paradox, named after the British statistician Edward Simpson (another white dude). everydayconcepts.io/simpsons-parad… 5/
It could very well be that women are overall paid less at an organization like Google, but if you adjust for a third variable like job level or function, the sign flips and suddenly you get the exact opposite direction for the relationship. 6/
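A tiny numerical sketch of such a sign flip, using invented pay figures (in $1,000) for two hypothetical job levels. Nothing here comes from Google's data; the numbers are chosen only so that the arithmetic is easy to check by hand.

```python
from statistics import mean

# Invented records: (gender, job_level, pay in $1,000).
employees = (
    [("F", "junior", 102)] * 8 + [("M", "junior", 100)] * 2
    + [("F", "senior", 152)] * 2 + [("M", "senior", 150)] * 8
)

def pay_gap(rows):
    """Mean pay of women minus mean pay of men."""
    women = [pay for gender, _, pay in rows if gender == "F"]
    men = [pay for gender, _, pay in rows if gender == "M"]
    return mean(women) - mean(men)

pooled_gap = pay_gap(employees)
gap_by_level = {
    level: pay_gap([row for row in employees if row[1] == level])
    for level in ("junior", "senior")
}
print(pooled_gap)
print(gap_by_level)
```

In this toy data women earn 28 less than men on average overall, yet 2 more than men within each job level, simply because most women sit in the lower-paid level.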
To find the right answer, we cannot simply look at the data, because there is nothing in it that can tell us how to properly analyze it — no matter how large it is and how finely we can slice it. We need to make causal assumptions! 7/
Variables such as job level and function are likely affected by gender, because we know from prior literature that there are, e.g., child penalties for women and gender-specific occupation choices. This turns them into so-called "post-treatment variables". 8/
At the same time, there might be many determinants of an employee's job level and compensation that even @Google can't observe in their vast data. One prime candidate for such unobservables is personal job-related skills, which we often only have rough proxies for. 9/
But if we now want to estimate the effect of gender on compensation, job level becomes a collider: it is affected by both gender and unobserved skills. If we control for it, by running separate regressions for each job level, we open a non-causal path from gender to compensation through skills, and the resulting bias stems from the fact that employees with higher skills receive higher salaries. 10/
The intuition here is that women have more obstacles to overcome to make it to higher-level positions. Those women who make it nonetheless are often a specifically selected group with likely higher skills than average. This higher skill level pushes up their annual pay. 11/
So especially in groups with higher seniority you will find women who consistently overperformed throughout their career to make it this far. It is therefore not surprising that they might also receive, e.g., higher bonuses than their male peers. 12/
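The selection argument above can be illustrated with a small simulation. This is a sketch under loudly stated assumptions, not a model of Google: skill is unobserved and standard normal, women face a hypothetical promotion hurdle of one skill unit, and pay depends only on job level and skill, so there is no direct gender effect by construction. The gap within a job level is computed as a difference in means, which equals the OLS coefficient on a gender dummy in a regression run on that level alone.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def simulate_employee():
    female = rng.random() < 0.5
    skill = rng.gauss(0, 1)            # unobserved by the analyst
    obstacle = 1.0 if female else 0.0  # assumed promotion hurdle for women
    senior = skill - obstacle > 0      # job level depends on gender AND skill (collider)
    # Pay depends on job level and skill only: no direct gender effect by construction.
    pay = 100 + 50 * senior + 10 * skill + rng.gauss(0, 5)
    return female, senior, pay

staff = [simulate_employee() for _ in range(20_000)]

def pay_gap(rows):
    """Mean pay of women minus mean pay of men."""
    women = [pay for female, _, pay in rows if female]
    men = [pay for female, _, pay in rows if not female]
    return mean(women) - mean(men)

pooled_gap = pay_gap(staff)
gap_senior = pay_gap([row for row in staff if row[1]])
gap_junior = pay_gap([row for row in staff if not row[1]])
print(f"pooled: {pooled_gap:.1f}, senior: {gap_senior:.1f}, junior: {gap_junior:.1f}")
```

Under these assumptions the pooled gap is negative, because fewer women are promoted, while the gap within each job level is positive, because women at a given level are on average more skilled than the men at that level. A level-by-level analysis would therefore conclude that men are underpaid even though the only gender effect in the simulated world is the promotion hurdle working against women.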
More on these causal inference challenges and the dangers of estimating the gender wage gap with sophisticated ML methods without a proper theory behind it can be found in this paper: arxiv.org/abs/2108.11294 Thanks for reading! 13/13

