Kawin Ethayarajh
PhD student @stanfordnlp; @facebook Fellow in NLP

Jun 23, 2020, 8 tweets

Is your NLP classifier actually (un)biased? Or is your diagnosis based on too little data?

It might be the latter!

In my #ACL2020 paper, I discuss why we need bigger datasets for conclusively identifying classification bias in NLP.

arxiv.org/abs/2004.12332 1/

Background: Large NLP datasets don't come with annotations for protected attributes (e.g., gender). To test for classification bias, one typically annotates a small sample of data (often < 5K examples). WinoBias and WinoGender are great examples of these bias-specific datasets. 2/

Intuitively, the less data we annotate, the less certain we are that our estimate is close to the true bias. But how can we quantify this uncertainty? 3/

I suggest framing classification bias as the difference in expected cost across the protected and unprotected groups. Different fairness measures (e.g., equal opportunity) have different cost functions. 4/
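(Not part of the original thread; a minimal sketch of this framing. It assumes an equal-opportunity-style cost — 1 for a false negative, 0 otherwise — and the function names are mine, not the paper's.)

```python
import numpy as np

def equal_opportunity_cost(y_true, y_pred):
    # Cost of 1 for each false negative (true label 1 predicted as 0), else 0.
    # Equal opportunity compares this cost across groups on positive examples.
    return ((y_true == 1) & (y_pred == 0)).astype(float)

def empirical_bias(costs, group):
    # Bias estimate = mean cost on the protected group minus mean cost on
    # the unprotected group (`group` is 1 for protected, 0 for unprotected).
    return costs[group == 1].mean() - costs[group == 0].mean()
```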

By treating an example's contribution to the overall bias as a random variable, we can apply Bernstein bounds to come up with a confidence interval for the overall bias.

This proposed approach is called Bernstein-bounded unfairness (BBU for short). 5/
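(Again not from the thread: a rough sketch of a Bernstein confidence interval for a sample mean, the idea BBU builds on. The paper's exact construction of the per-example random variable and its range bound differ in the details; here the variance is simply estimated from the sample.)

```python
import numpy as np

def bernstein_halfwidth(z, c_bound, rho=0.05):
    # Half-width t of a (1 - rho) confidence interval for the mean of the
    # per-example bias contributions z, from the two-sided Bernstein bound
    # P(|mean(z) - E[z]| >= t) <= 2 exp(-n t^2 / (2 sigma^2 + 2 c_bound t / 3)),
    # where c_bound bounds |z_i| and sigma^2 is the variance (estimated here).
    n, sigma2, log_term = len(z), np.var(z), np.log(2.0 / rho)
    # Setting the bound equal to rho gives a quadratic in t; take the positive root.
    a = n
    b = -(2.0 / 3.0) * c_bound * log_term
    c = -2.0 * sigma2 * log_term
    return (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
```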

BBU suggests that bias-specific datasets used in NLP are often too small to conclusively identify bias.

For example, say a coref resolution system is 5% better on gender-stereotypical inputs. To claim bias with 95% confidence, we'd need a dataset 3.8x bigger than WinoBias! 6/
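(To show the shape of the sample-size calculation behind claims like this, one can invert the same Bernstein bound and ask how many annotated examples are needed for a given confidence. The inputs below are hypothetical placeholders, not the paper's; the 3.8x-WinoBias figure comes from the exact setup in the paper.)

```python
import math

def examples_needed(t, sigma2, c_bound, rho=0.05):
    # Smallest n such that a Bernstein confidence interval of half-width t
    # holds with probability 1 - rho, given variance sigma2 and range bound
    # c_bound on the per-example bias contributions: solve
    # 2 exp(-n t^2 / (2 sigma2 + 2 c_bound t / 3)) <= rho for n.
    return math.ceil((2 * sigma2 + (2.0 / 3.0) * c_bound * t)
                     * math.log(2.0 / rho) / t ** 2)

# Hypothetical inputs: a 5% cost gap (t = 0.05) at 95% confidence.
n_required = examples_needed(t=0.05, sigma2=0.25, c_bound=1.0, rho=0.05)
```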

Takeaways / Future Work:

#1 The NLP community needs much larger bias-specific datasets so that we can make claims about the presence / absence of bias with high confidence.

#2 Tighter bounds on bias estimates can help us make more confident claims with less data.

7/

This work was done at @stanfordnlp and @StanfordAILab. Many thanks to @sulin_blodgett, @haldaume3, @s010n, and @hannawallach for including this in their recent lit review! arxiv.org/abs/2005.14050

8/
