When a machine learning system uses argmax to select outputs from a probability distribution — and most of them do — it's a clue that it might be biased. That's because argmax selects the "most probable" output, which may amplify tiny data biases into perfectly biased outputs.
Here's an exercise (with solution) I developed for my Fairness in ML course with @ang3linawang's help. It uses a toy model to show how bias amplification like the one in the "Men also like shopping" paper can arise through the use of argmax alone! drive.google.com/file/d/1baK_c4…
This graph is the punchline. α and β are parameters that describe correlations in the input and the graphs show correlations in the (multilabel) output. It should be terrifying from a scientific and engineering perspective even if there are no relevant fairness considerations!
The nonmonotonicity & discontinuity mean that systems that use argmax to select outputs from a probability distribution can behave unintuitively and unpredictably. All bets are off if there is a distribution shift when the model is deployed. (There's always a distribution shift.)
Remember how Twitter's image cropping algorithm preferred lighter-skinned people? Twitter investigated and found that the use of argmax was one reason.
See Section 3.5 arxiv.org/pdf/2105.08667….
Not the only reason of course, but it's a bias amplifier.
Remember gender biases in online translation? As long as argmax is used, these will probably persist. Even if the training corpus is congruent with a particular stereotype 51%-49%, the output will reflect the stereotype every time.
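To make the 51%-49% point concrete, here is a minimal sketch. It assumes a hypothetical two-way output distribution; the labels and the helper names `argmax_decode` and `sample_decode` are illustrative and do not come from any real translation system. It compares how often each decoding rule emits the majority option.

```python
import random

def argmax_decode(probs):
    # Always return the single most probable label.
    return max(probs, key=probs.get)

def sample_decode(probs, rng):
    # Draw one label in proportion to its probability.
    labels = list(probs)
    weights = [probs[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]

# Hypothetical model output: a 51%-49% split between two renderings.
probs = {"he": 0.51, "she": 0.49}

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Share of outputs that pick the majority label under each rule.
argmax_share = sum(argmax_decode(probs) == "he" for _ in range(n)) / n
sample_share = sum(sample_decode(probs, rng) == "he" for _ in range(n)) / n

print(f"argmax picks 'he' {argmax_share:.0%} of the time")
print(f"sampling picks 'he' about {sample_share:.0%} of the time")
```

Under argmax the majority label is emitted every time, while sampling roughly reproduces the input split. Sampling is shown only as a point of comparison, not as a recommended fix.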
There's a subtle reason why argmax is a severe problem for multi-label classification, structured prediction, sequence prediction, etc., and much less serious for vanilla classification. This means it crops up all the time in computer vision, NLP, and recommender systems.
I don't know of any effective technical fixes for the argmax issue. But a really good approach for handling model uncertainty is to show the user the possible outputs and ask them to choose. This goes against the mantra of frictionless design, though, so it's very rarely deployed.
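One way the "show the possible outputs" idea could look in a backend is sketched below. Everything here is an assumption for illustration: the function name `candidates_to_show`, the `margin` threshold, and the example sentences are hypothetical, and a real system would need its own rule for deciding when outputs are close enough to surface.

```python
def candidates_to_show(probs, margin=0.2):
    """Return every output whose probability is within `margin` of the top one.

    A single-element result means the model is confident enough to show one
    answer; a longer list is handed to the UI so the user can choose.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    top_p = ranked[0][1]
    return [label for label, p in ranked if top_p - p <= margin]

# Near-tie: both renderings are surfaced instead of silently picking one.
close_call = {"he is a doctor": 0.51, "she is a doctor": 0.49}
print(candidates_to_show(close_call))

# Clear winner: only the top output is shown.
clear_call = {"the cat sat": 0.9, "the cat sits": 0.1}
print(candidates_to_show(clear_call))
```

The design choice is that the backend returns a list rather than a single answer, which is exactly the kind of change that forces the interface to be designed alongside the model.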
Generalizing a bit, a recurring theme in my course was that many fairness issues in online services can probably only be effectively addressed by **co-designing the user interface and the ML backend**. This of course violates core principles of sound engineering.
It's time for the fairness conversation to move beyond narrow questions of data and algorithms and grapple with thornier issues like the fact that design values and engineering best practices (which are in turn shaped by business models) often get in the way of mitigating biases.
