The #NobelPrize in economics was just awarded to 3 top economists. #EconTwitter is all over it, but the data science/ML community is totally missing out!

Here's why Data Scientists should start paying attention and what they can take away 🧵
The prize was awarded to David Card, Josh Angrist (@metrics52), and Guido Imbens for their monumental contributions to statistical methodology and causal inference.

They used and developed strategies that amounted to a true paradigm shift, bridging the gap between data and causation in economics
One part of the prize went to David Card from UC Berkeley.

Card is best known for his famous minimum wage study, which counterintuitively found that an increase in the minimum wage did *not* reduce employment. How?

The study applied a strategy called Difference in Differences
Difference in Differences (DiD) is an extremely powerful method that any data scientist should know.

DiD compares outcomes across *groups* and *times*, not just between a treatment and a control group (as in a standard A/B test)

In other words: how can we tell the employment trend wasn't just a fluke?
Card's study compared NJ and PA, two similar neighboring states. NJ had a minimum wage increase but PA did not. This made 2 groups:

NJ: Treatment group (subject to min wage incr)
PA: Control group (no change in wage)

Then, they compared both groups' employment trends before and after the change
DiD assumes that, absent the treatment (the min wage increase), both groups' outcomes would have followed the same trend. We call this the "Parallel Trends" assumption

Then, we compare how the groups diverge after the treatment. The difference between the trends is the estimated Treatment Effect
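To make the arithmetic concrete, here's a minimal sketch in Python. The employment numbers are made up for illustration — they are *not* Card and Krueger's actual data:

```python
# Hypothetical average employment per store, before and after the wage change
# (made-up numbers for illustration only).
nj_before, nj_after = 20.0, 21.0   # NJ: treatment group (min wage increase)
pa_before, pa_after = 23.0, 21.5   # PA: control group (no change)

# A naive before/after comparison in NJ alone mixes the policy's effect
# with whatever economy-wide trend hit both states.
naive_change = nj_after - nj_before                                  # +1.0

# DiD subtracts the control group's trend. Under parallel trends,
# what's left is the estimated treatment effect.
treatment_effect = (nj_after - nj_before) - (pa_after - pa_before)   # +2.5
```

Note how the two estimates even disagree on direction of magnitude: the control group's decline means the naive number *understates* the effect here.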
We call studies like this "quasi-experimental"

Although we measure the effect of a particular intervention against a control group, we don't use randomization to select the control group. (Card chose PA as a control)

If the assignment arises from forces outside anyone's control and is as good as random (like a policy change), we call it a "natural experiment"
DiD seems like a pretty obvious way to run experiments. But for the longest time (pre-1990s), very few people used such thinking!

The results weren't as important as this novel use of rigorous experimental design.
Side note: If you're interested in diving deeper into the minimum wage debate, I highly recommend checking out @NeumarkDN's analysis on the subject. There's much more nuance in this paper: nber.org/papers/w18681
The other 2 recipients, Josh Angrist and Guido Imbens, were chosen for their groundbreaking work on instrumental variables, randomized trials, and natural experiments in economics.

But this thread is already a bit too long... Let me know if you want to see that breakdown too
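For the curious, here's the flavor of instrumental variables in a few lines: a minimal two-stage least squares (2SLS) sketch on synthetic data. All variables and numbers here are hypothetical, chosen only to show why the naive regression fails and the IV estimate doesn't:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic setup: u is an unobserved confounder affecting both
# x (e.g. schooling) and y (e.g. wages), so a naive regression of y on x
# is biased. z is an instrument: it shifts x but affects y only through x.
u = rng.normal(size=n)
z = rng.normal(size=n)
x = 1.0 * z + 1.0 * u + rng.normal(size=n)
y = 2.0 * x + 3.0 * u + rng.normal(size=n)   # true causal effect of x is 2.0

def ols_slope(a, b):
    """Slope from regressing b on a (with an intercept)."""
    A = np.column_stack([np.ones_like(a), a])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef[1]

naive = ols_slope(x, y)   # biased upward by the confounder u (≈ 3, not 2)

# Two-stage least squares:
# Stage 1: regress x on z; keep the fitted values (the "clean" variation in x)
Z = np.column_stack([np.ones_like(z), z])
first_stage, *_ = np.linalg.lstsq(Z, x, rcond=None)
x_hat = Z @ first_stage

# Stage 2: regress y on the fitted values to recover the causal effect (≈ 2)
iv = ols_slope(x_hat, y)
```

In practice you'd use a library (e.g. `linearmodels`' IV estimators) rather than hand-rolling the stages, but the two regressions above are the whole idea.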
And that's it (for now).

I make threads like this on the regular, covering topics from AI/ML, statistics, tech business breakdowns, and more.

As a quasi-experiment, I'll see how engagement with this thread compares with others I've made :)
