Why should we study bias and fairness in NLP? Tutorial session led by Margaret Mitchell, @vinodkpg, @kaiwei_chang & @bluevincent. #emnlp2019
Human biases in data! 1) selection bias: a) the selection does not reflect a random sample, e.g. gender bias in Wikipedia and Britannica; b) crowd workers (Turkers) are mainly from North America; 2) biased data representation: a) some groups are represented less often and less positively than others;
3) biased labels: a) annotations reflect the worldviews of the annotators, e.g. Western wedding ceremonies vs. other wedding ceremonies.
Biases in interpretation! 1) confirmation bias: the tendency to search for info in a way that confirms one's beliefs; 2) overgeneralization;
3) correlation fallacy: confusing correlation with causation, e.g. women were allowed to vote and then we had two world wars; 4) automation bias: humans favor suggestions from automated decision-making systems over contradictory info.
Human bias is everywhere! It's a vicious cycle where bias is perpetuated throughout the pipeline; this is also known as bias laundering.

However, bias is not always bad. It can be good, bad, or even neutral.
The bad ones. 1) Off-the-shelf language identification (LID) systems underrepresent minority groups, which means they are systematically excluded even though they stand to benefit the most from such technology. 2) Predicting homosexuality isn't about facial differences but differences in culture!
3) Predicting criminality from facial images, claiming the face reveals personality. 4) Predicting toxicity in text: a) unintended biases towards certain identity terms, e.g. gay, transgender; b) named entities, e.g. "I hate Justin Timberlake" vs. "I hate Rihanna"; c) mentions of disabilities. A rough sketch of this substitution check follows below.
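
A minimal, hypothetical sketch of that identity-term substitution check: hold the sentence template fixed, swap only the identity term or name, and compare the classifier's scores. `toxicity_score` is a placeholder for whatever model is being audited, not an API from the tutorial.

```python
# Minimal sketch of an identity-term substitution check for a toxicity model.
# `toxicity_score` is a hypothetical placeholder for the classifier under audit.

def toxicity_score(text: str) -> float:
    """Placeholder: return the model's probability that `text` is toxic."""
    return 0.5  # dummy value so the sketch runs end-to-end

templates = ["I am a {} person.", "I hate {}."]
fillers = [["gay", "straight", "transgender"],
           ["Justin Timberlake", "Rihanna"]]

for template, terms in zip(templates, fillers):
    for term in terms:
        sentence = template.format(term)
        # An unbiased model should give near-identical scores within a template.
        print(f"{sentence!r} -> {toxicity_score(sentence):.2f}")
```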
Don't all these just reflect our society? Yes. So should we just leave it as it is? Would it harm some particular groups? What kind of harm? Some food for thought.
How should we evaluate fairness & inclusion? 1) intersectional evaluation: compare performance across subgroups (e.g. gender × race); 2) confusion matrices, disaggregated per subgroup. A sketch of this disaggregated evaluation is below.
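
A minimal sketch of a per-subgroup confusion matrix; the toxicity labels, predictions, and subgroup tags here are made up purely for illustration.

```python
# Minimal sketch of disaggregated evaluation: tally a confusion matrix and
# false-positive rate separately for each subgroup (toy data, illustration only).
from collections import defaultdict

def subgroup_confusion(y_true, y_pred, groups):
    """Return TP/FP/FN/TN counts per subgroup for a binary classifier."""
    stats = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1 and p == 1:
            stats[g]["tp"] += 1
        elif t == 0 and p == 1:
            stats[g]["fp"] += 1
        elif t == 1 and p == 0:
            stats[g]["fn"] += 1
        else:
            stats[g]["tn"] += 1
    return stats

# Toy example: 1 = toxic; `groups` marks which identity term a sentence mentions.
y_true = [0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1]
groups = ["gay", "straight", "gay", "straight", "gay", "gay"]

for g, s in subgroup_confusion(y_true, y_pred, groups).items():
    fpr = s["fp"] / max(s["fp"] + s["tn"], 1)  # false-positive rate per subgroup
    print(g, s, "FPR =", round(fpr, 2))
```

For an intersectional evaluation, the subgroup key could simply be a tuple such as (gender, race) instead of a single identity term.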
How do misrepresentation and bias appear in models? Through implicit bias that stems from how our society is structured and perceived. For instance, only a small portion of Wikipedia editors are women, so models trained on that data inherit the bias.
The Implicit Association Test (IAT) is used to study implicit biases in humans. The same idea is borrowed to test biases in word embeddings: the Word Embedding Association Test (WEAT), sketched below.
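
A minimal sketch of the WEAT effect size in the spirit of Caliskan et al. (2017), assuming `emb` is a dict mapping words to pre-trained embedding vectors; the word sets are illustrative, not the ones from the tutorial.

```python
# Minimal sketch of the WEAT effect size, assuming `emb` maps words to vectors.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B, emb):
    """s(w, A, B): how much more strongly w associates with attributes A than B."""
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect_size(X, Y, A, B, emb):
    """Difference in mean association of target sets X vs. Y,
    normalised by the standard deviation over all targets."""
    x_assoc = [association(x, A, B, emb) for x in X]
    y_assoc = [association(y, A, B, emb) for y in Y]
    return (np.mean(x_assoc) - np.mean(y_assoc)) / np.std(x_assoc + y_assoc)

# Illustrative word sets (male/female targets, career/family attributes).
X = ["he", "man", "boy"]
Y = ["she", "woman", "girl"]
A = ["career", "salary", "office"]
B = ["home", "family", "children"]
# effect = weat_effect_size(X, Y, A, B, emb)  # emb: any pre-trained embeddings
```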