Testing for bias in Google Translate: a thread. 1/13
Google Translate is extremely useful but it has flaws. For example, how it handles gender. Translating gender-neutral Turkish to English requires that pronouns gain gender, and stereotypes can emerge ("X is an engineer" -> "he is an engineer"). Google is trying to fix this! 2/13
I recently noticed that Google Translate provides suggestions for the source language. Similar to Google web search, these suggestions seem to be based on popularity. "she is pretty" is a common sentence for someone who wants to learn another language, right? 3/13
Naturally, I got curious and started testing how gendered the suggestions are. The farther down the rabbit hole you go, the worse it gets. Why should "he" have a car while "she" has a boyfriend? Maybe she would trade in her boyfriend for a car, given the chance. 4/13
This problem persists when you try other common verbs like "work" and "take." Naturally, she takes care of the kids and works in a hospital, while he takes a shower and works in his workshop (sidenote: how many men own a workshop?). 5/13
Let's get into more explicit gender stereotypes. What are the most basic assumptions that Google Translate has about men and women? Women may be property, but men are trash, so...equality? 6/13
In general, I found the "boys" queries to be less biased than the "girls" queries, but not uniformly. 7/13
I tried out some family terms, hoping that they would be more neutral or at least PG. I was very wrong. In what world is "your mother is practicing prostitution" an acceptable suggestion? 8/13
For the record, I checked with COCA (Corpus of Contemporary American English) on some of the search queries. Note how "your mother is practicing" is nowhere near the most frequent n-grams. 9/13
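A rough sketch of this kind of frequency check, for anyone who wants to repeat it. It assumes a tiny made-up corpus standing in for COCA (which is queried through its own interface, not shown here), and `continuation_counts` is a hypothetical helper name, not part of any library.

```python
from collections import Counter

def continuation_counts(sentences, prefix):
    """Count which word follows `prefix` in a list of sentences."""
    prefix_tokens = prefix.lower().split()
    n = len(prefix_tokens)
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Slide over every position that leaves room for one following word.
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == prefix_tokens:
                counts[tokens[i + n]] += 1
    return counts

# Toy stand-in corpus (invented for illustration, NOT real COCA data).
toy_corpus = [
    "your mother is a teacher",
    "i think your mother is right",
    "your mother is a doctor",
    "your mother is here",
]

counts = continuation_counts(toy_corpus, "your mother is")
for word, freq in counts.most_common():
    print(word, freq)
```

If a suggested continuation barely shows up in a large reference corpus, ordinary usage frequency probably isn't what is driving the suggestion.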
The rabbit hole doesn't end with gender. There are some wild results when you start testing for racial bias. "black people have swag"? "Latino gang"? How should a user react when the suggestions look like this? 10/13
I don't know how these suggestions are generated. I suspect that they're based on search popularity, with heavier weighting toward recent searches. This would explain Google Translate's response to feminism. 11/13
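To make that guess concrete, here is a minimal sketch of what "popularity with heavier weighting toward recent searches" could look like. This is purely an illustration of the hypothesis, not Google's actual method: the exponential decay, the 30-day half-life, the function names, and the toy log are all assumptions.

```python
import math

def recency_weighted_score(timestamps, now, half_life_days=30.0):
    """Sum one weight per past query; a query loses half its weight every half-life."""
    decay = math.log(2) / half_life_days
    return sum(math.exp(-decay * (now - t)) for t in timestamps)

def rank_suggestions(query_log, now, half_life_days=30.0):
    """Order candidate suggestions from highest to lowest recency-weighted score."""
    scores = {
        query: recency_weighted_score(times, now, half_life_days)
        for query, times in query_log.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical log: suggestion -> days on which it was searched (invented data).
toy_log = {
    "phrase a": [0, 1, 2, 3, 4, 5],  # searched more often, but long ago
    "phrase b": [98, 99, 100],       # searched less often, but recently
}

ranking = rank_suggestions(toy_log, now=100)
print(ranking)
```

Under a scheme like this, a short burst of recent searches can outrank a phrase with more total searches, which is how a trending topic could take over the suggestions.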
When a system like this promotes a stereotype, it hurts the people at the receiving end of the stereotype and may encourage people who use the stereotype to cause harm. If Google Translate suggests that women are property, then a troll can use that as ammo for an argument. 12/13
I recently spoke about implicit bias in NLP, based on my understanding of current work. Systems like Google Translate, which are public and used for everyday communication, must be audited for bias and improved to treat all humans with respect. 13/13
Addendum 1: if anyone tells you that linguistics training is useless, prove them wrong. We need to seriously assess the internal workings of NLP systems as they become more pervasive, and no one is better equipped to do that than a linguist.
Addendum 2: I only tested English suggestions, but I have no doubt that the problem persists in other languages. Feel free to reply with examples from other languages!
Addendum 3: for more info on how Google Translate has tried to fix gender bias in the past, check this out: blog.google/products/trans…
Thread by Ian Stewart.