How do people collaborate with algorithms? In a new paper for #CSCW2021, Yiling Chen and I show that even when risk assessments improve people's predictions, they don't actually improve people's *decisions*. Additional details in thread 👇

Paper: benzevgreen.com/21-cscw/
This finding challenges a key argument for adopting algorithms in government. Algorithms are adopted for their predictive accuracy, yet decisions require more than just predictions. If improving human predictions doesn't improve human decisions, then algorithms don't deliver the benefits they're adopted for.
Instead of improving human decisions, algorithms could generate unintended and unjust shifts in public policy without being subject to democratic deliberation or oversight.
Take the example of pretrial risk assessments. Even though our algorithm improved predictions of risk, it also made people more sensitive to risk when making decisions about which defendants to release. Showing the risk assessment therefore increased racial disparities.
This has implications for how we evaluate algorithms. Rather than testing only an algorithm's performance on its own, we also need to test how people interact with the algorithm. We shouldn't incorporate an algorithm into decision-making without first studying how people use it.
This is a research question I've been wanting to answer since I started working on human-AI interactions several years ago. It cuts right to the heart of arguments for why governments should adopt algorithms for high-stakes decisions.
But it's actually a difficult question to test because you have to account for how algorithms alter people's predictions of risk. So there are some interesting experimental design details in the paper, if you're into that sort of thing.
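
For a concrete sense of the kind of analysis this involves, here's a minimal, hypothetical sketch (not the paper's actual code or data): to test whether showing a risk score makes decisions more sensitive to predicted risk, you can regress decisions on participants' predicted risk, a treatment indicator for whether the score was shown, and their interaction. All variable names and the simulated data below are illustrative assumptions.

```python
# Hypothetical sketch only: simulated data and illustrative variable names,
# not the study's actual analysis code.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000

df = pd.DataFrame({
    "pred_risk": rng.uniform(0, 1, n),   # participant's predicted risk for the defendant
    "shown_ra": rng.integers(0, 2, n),   # 1 = risk assessment score was shown
})

# Simulate detention decisions that become more risk-sensitive when the
# score is shown (the qualitative pattern described above).
logit = -1.5 + 2.0 * df["pred_risk"] + 1.0 * df["shown_ra"] * df["pred_risk"]
p_detain = 1.0 / (1.0 + np.exp(-logit.to_numpy()))
df["detain"] = rng.binomial(1, p_detain)

# A positive coefficient on shown_ra:pred_risk would indicate that decisions
# weight predicted risk more heavily when the risk assessment is shown.
model = smf.logit("detain ~ pred_risk + shown_ra + shown_ra:pred_risk", data=df).fit()
print(model.summary())
```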

More from @benzevgreen

19 Nov 18
How do people respond to the predictions made by pretrial risk assessments? My new paper (coauthored with Yiling Chen @hseas and forthcoming at #FAT2019) finds evidence for inaccurate and biased responses.

Available here: scholar.harvard.edu/files/19-fat.p…

+ tweet thread summary below:
Research into fair machine learning fails to capture an essential aspect of how risk assessments impact the criminal justice system: their influence on judges. Any consideration of risk assessments must be informed by rigorous studies of how judges actually interpret and use them.
So we ran experiments on Mechanical Turk to explore these human-algorithm interactions.

🚨 Caveat: our results are based on Turk workers, not judges. But they highlight interactions that should be studied further before risk assessments can be responsibly deployed. 🚨