
Is your NLP classifier actually (un)biased? Or is your diagnosis based on too little data?

It might be the latter!

In my #ACL2020 paper, I discuss why we need bigger datasets for conclusively identifying classification bias in NLP.

arxiv.org/abs/2004.12332 1/
Background: Large NLP datasets don't come with annotations for protected attributes (e.g., gender). To test for classification bias, one typically annotates a small sample of the data (often < 5K examples). WinoBias and WinoGender are great examples of such bias-specific datasets. 2/
Intuitively, the less data we annotate, the less certain we are that our estimate is close to the true bias. But how can we quantify this uncertainty? 3/
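As a concrete illustration (not necessarily the paper's method), one standard way to quantify that uncertainty is a bootstrap confidence interval around a bias metric estimated from the annotated sample. The sketch below assumes a simple, hypothetical setup: the metric is the false-positive-rate gap between two groups, and the annotations are simulated for an unbiased classifier. It shows how the interval tightens as the annotated sample grows; with only a few hundred examples the interval is often too wide to conclude anything about bias either way.

```python
import numpy as np

rng = np.random.default_rng(0)

def fpr_gap(labels, preds, groups):
    """Bias metric (assumed for illustration): difference in false positive
    rates between group 0 and group 1."""
    rates = []
    for g in (0, 1):
        neg = (groups == g) & (labels == 0)          # true negatives of group g
        rates.append(preds[neg].mean() if neg.any() else 0.0)
    return rates[0] - rates[1]

def bootstrap_ci(labels, preds, groups, n_boot=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for the FPR gap."""
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                  # resample rows with replacement
        stats.append(fpr_gap(labels[idx], preds[idx], groups[idx]))
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

# Simulated annotated samples of increasing size; the true FPR gap is 0,
# i.e., the classifier is unbiased by construction.
for n in (500, 5_000, 50_000):
    groups = rng.integers(0, 2, n)
    labels = rng.integers(0, 2, n)
    preds = rng.integers(0, 2, n)                    # classifier predictions
    lo, hi = bootstrap_ci(labels, preds, groups)
    print(f"n={n:>6}: 95% CI for FPR gap = [{lo:+.3f}, {hi:+.3f}]")
```

With n=500 the interval typically spans several percentage points around zero, so the same data could be read as "biased" or "unbiased" depending on the point estimate; only the larger samples pin the gap down.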