Is your NLP classifier actually (un)biased? Or is your diagnosis based on too little data?
It might be the latter!
In my #ACL2020 paper, I discuss why we need bigger datasets for conclusively identifying classification bias in NLP.
arxiv.org/abs/2004.12332 1/
Background: Large NLP datasets don't come with annotations for protected attributes (e.g., gender). To test for classification bias, one typically annotates a small sample of data (usually fewer than 5K examples). WinoBias and WinoGender are great examples of these bias-specific datasets. 2/
Intuitively, the less data we annotate, the less certain we are that our estimate is close to the true bias. But how can we quantify this uncertainty? 3/
I suggest framing classification bias as the difference in expected cost across the protected and unprotected groups. Different fairness measures (e.g., equal opportunity) correspond to different cost functions. 4/
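To make the framing concrete, here is a minimal sketch of a bias estimate as a difference in mean cost between two groups. The function name `bias_estimate`, the sign convention (protected minus unprotected), the 0/1 cost, and the toy data are all illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def bias_estimate(costs: Sequence[float], is_protected: Sequence[bool]) -> float:
    """Mean cost on the protected group minus mean cost on the unprotected group."""
    protected = [c for c, p in zip(costs, is_protected) if p]
    unprotected = [c for c, p in zip(costs, is_protected) if not p]
    if not protected or not unprotected:
        raise ValueError("need at least one example from each group")
    return sum(protected) / len(protected) - sum(unprotected) / len(unprotected)


# Toy data (made up): cost 1.0 marks a false negative on a positive example,
# 0.0 marks a correct prediction. Applying this cost to positive examples only
# is one way to express an equal-opportunity-style measure (an assumption here).
costs = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
is_protected = [True, True, True, True, False, False, False, False]

gap = bias_estimate(costs, is_protected)
print(f"estimated bias (difference in mean cost): {gap:.2f}")
```

Swapping in a different cost function changes which fairness measure the gap corresponds to; the estimator itself stays the same.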
By treating an example's contribution to the overall bias as a random variable, we can apply Bernstein bounds to come up with a confidence interval for the overall bias.
This proposed approach is called Bernstein-bounded unfairness (BBU for short). 5/
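As a rough illustration of how a Bernstein-style interval behaves, the sketch below uses the textbook two-sided Bernstein inequality for the mean of n independent bounded random variables. It is not the paper's exact formulation of BBU: the function names, the assumption that an upper bound on the variance is known, and the parameter `bound` (an assumed bound on how far any single example's contribution can deviate from its mean) are all hypothetical choices made for this example.

```python
import math


def bernstein_tail(n: int, variance: float, bound: float, t: float) -> float:
    """Upper bound on P(|sample mean - true mean| > t) from Bernstein's inequality."""
    return 2.0 * math.exp(-n * t * t / (2.0 * variance + 2.0 * bound * t / 3.0))


def bernstein_half_width(n: int, variance: float, bound: float, delta: float) -> float:
    """Smallest t for which the Bernstein tail bound equals delta.

    Obtained by solving the quadratic
        n * t^2 - (2 * bound * L / 3) * t - 2 * variance * L = 0,
    where L = log(2 / delta), and taking the positive root.
    """
    if n <= 0 or variance < 0 or bound <= 0 or not 0.0 < delta < 1.0:
        raise ValueError("invalid arguments")
    log_term = math.log(2.0 / delta)
    a = bound * log_term / 3.0
    return (a + math.sqrt(a * a + 2.0 * n * variance * log_term)) / n


# Illustrative numbers only: 95% confidence, variance at most 0.25, deviations at most 1.
for n in (1_000, 4_000, 16_000):
    t = bernstein_half_width(n, variance=0.25, bound=1.0, delta=0.05)
    print(f"n={n:>6}: bias estimate is within +/- {t:.4f} of the true value")
```

If the interval around the estimated bias still contains zero, the data cannot rule out "no bias" at that confidence level, which is why the half-width shrinking with n matters.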
BBU suggests that bias-specific datasets used in NLP are often too small to conclusively identify bias.
For example, say a coref resolution system is 5% better on gender-stereotypical inputs. To claim bias with 95% confidence, we'd need a dataset 3.8x bigger than WinoBias! 6/
Takeaways / Future Work:
#1 The NLP community needs much larger bias-specific datasets so that we can make claims about the presence / absence of bias with high confidence.
#2 Tighter bounds on bias estimates can help us make more confident claims with less data.
7/
This work was done at @stanfordnlp and @StanfordAILab. Many thanks to @sulin_blodgett, @haldaume3, @s010n, and @hannawallach for including this in their recent lit review! arxiv.org/abs/2005.14050
8/
