Ben Green @benzevgreen, 11 tweets, 4 min read
How do people respond to the predictions made by pretrial risk assessments? My new paper (coauthored with Yiling Chen @hseas and forthcoming at #FAT2019) finds evidence for inaccurate and biased responses.

Available here: scholar.harvard.edu/files/19-fat.p…

+ tweet thread summary below:
Research into fair machine learning fails to capture an essential aspect of how risk assessments impact the criminal justice system: their influence on judges. Considerations of risk assessments must be informed by rigorous studies of how judges actually interpret and use them.
So we ran experiments on Mechanical Turk to explore these human-algorithm interactions.

🚨 Caveat: our results are based on Turk workers, not judges. But they highlight interactions that should be studied further before risk assessments can be responsibly deployed. 🚨
Result 1: Participants underperformed the risk assessment even when presented with its predictions. They were unable to effectively synthesize the algorithm's advice with their own judgment.
Result 2: People could not effectively evaluate the accuracy of their own or the risk assessment’s predictions. In fact, participants’ confidence in their performance was *negatively* associated with their actual performance.
Result 3: Participants exhibited behaviors fraught with “disparate interactions,” whereby the use of risk assessments led to higher risk predictions about black defendants and lower risk predictions about white defendants.
Example 1: Experiment participants were 25.9% more strongly influenced by the risk assessment to increase their risk prediction when evaluating black defendants than white ones, leading to a 20.3% larger average increase for black than for white defendants due to the risk assessment.
Example 2: Participants were 36.4% more likely to deviate positively from the risk assessment (i.e., predict higher levels of risk) and 21.5% less likely to deviate negatively from the risk assessment when evaluating black defendants.
One key takeaway: introducing risk assessments to the criminal justice system does not eliminate discretion in favor of “objective” judgments. Instead, risk assessments merely shift discretion to different places, including the judge’s interpretation and use of the assessment.
Further research about how people interact with algorithms is needed. We introduce a new “algorithm-in-the-loop” framework that places algorithms into the sociotechnical context of improving human decisions rather than the technical context of making predictions in the abstract.
Of course, it remains an open question how closely our results resemble the outcomes of real-world implementation. But if risk assessments are to be used at all, they must be grounded in rigorous evaluations of their real-world impacts instead of in their theoretical potential.