uncorrelated
Nov 19 · 11 tweets · 5 min read
White by Default?

The viral posts were right.

We scraped 5.5 million criminal records and 1.5 million mugshots from 39 states.

29% of Hispanics are being misclassified as White in official Department of Corrections databases.

Even in databases where Hispanic is an explicit classification. 🧵
Everyone's seen these collages claiming non-whites get classified as White in criminal databases.

The problem? Anecdotal. Cherry-picked. No way to verify.

We had 1.5 million mugshots, names, and official racial classifications.

Time to test it systematically.
We trained a multinomial logistic regression on 18 features:
• DeepFace racial probabilities from mugshots
• Census name demographics
• First and last name racial statistics

92.76% accuracy distinguishing Black, White and Hispanic.
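
For readers who want to see what a multinomial logistic regression does mechanically, here is a minimal standard-library sketch. It is not the thread's code: the three features, the synthetic data, the learning rate and the epoch count are all invented for illustration, and the real model used 18 features built from DeepFace and name statistics.

```python
import math
import random

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    """Class probabilities for one feature vector x."""
    return softmax([sum(w * v for w, v in zip(row, x)) + bias
                    for row, bias in zip(W, b)])

def predict(W, b, x):
    """Index of the most probable class."""
    probs = predict_proba(W, b, x)
    return probs.index(max(probs))

def train(X, y, n_classes, lr=0.5, epochs=200):
    """Fit multinomial logistic regression by full-batch gradient descent."""
    n, d = len(X), len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * d for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = predict_proba(W, b, x)
            for k in range(n_classes):
                # Gradient of cross-entropy loss: predicted minus one-hot target
                err = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += err
                for j in range(d):
                    grad_W[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * grad_b[k] / n
            for j in range(d):
                W[k][j] -= lr * grad_W[k][j] / n
    return W, b

# Synthetic toy data: 3 classes, 3 made-up features (NOT the thread's 18).
CLASSES = ["Black", "White", "Hispanic"]
rng = random.Random(0)
X, y = [], []
for label in range(len(CLASSES)):
    for _ in range(30):
        X.append([(0.8 if j == label else 0.1) + rng.gauss(0, 0.05)
                  for j in range(3)])
        y.append(label)

W, b = train(X, y, n_classes=len(CLASSES))
accuracy = sum(predict(W, b, x) == t for x, t in zip(X, y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.2%}")
```

The per-class probabilities from `predict_proba` are what the rest of the thread calls model confidence.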
The key insight: A sufficiently accurate linear model trained on biased data learns the TRUE signal, not the bias.

Systematic deviations between predictions and official labels indicate mislabeling by authorities, not model error.

Here's what we found:
29% of predicted Hispanics were officially classified as White.

Even at 95-100% model confidence, 22.4% of predicted Hispanics were still assigned White.

Median confidence for these cases? 91.7%.
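
The three numbers above are simple tabulations over (predicted label, confidence, official label) triples. A rough sketch of that bookkeeping, using a handful of invented rows and a hypothetical helper named `share_labeled`, might look like this:

```python
from statistics import median

# (predicted label, model confidence, official label): invented toy rows
records = [
    ("Hispanic", 0.99, "White"),
    ("Hispanic", 0.97, "Hispanic"),
    ("Hispanic", 0.96, "Hispanic"),
    ("Hispanic", 0.92, "White"),
    ("Hispanic", 0.70, "Hispanic"),
    ("Hispanic", 0.55, "White"),
    ("White", 0.98, "White"),
    ("Black", 0.99, "Black"),
]

def share_labeled(rows, predicted, official, min_conf=0.0):
    """Share of rows predicted as `predicted` (at or above min_conf)
    whose official label is `official`. Returns None if no rows qualify."""
    pool = [r for r in rows if r[0] == predicted and r[1] >= min_conf]
    if not pool:
        return None
    return sum(1 for r in pool if r[2] == official) / len(pool)

# Share of predicted Hispanics officially labeled White, at any confidence
overall = share_labeled(records, "Hispanic", "White")
# Same share restricted to the 95-100% confidence band
high_conf = share_labeled(records, "Hispanic", "White", min_conf=0.95)
# Median model confidence among the disagreeing cases
median_conf = median(r[1] for r in records
                     if r[0] == "Hispanic" and r[2] == "White")

print(overall, high_conf, median_conf)
```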
Visual inspection confirmed it.

These are people classified as "White" in official records. Look at those names! [Image]
Furthermore, a principal-component (PC) map showed that many officially "White" records fell in the Hispanic region of the feature space, but not the other way around. Measured by Euclidean distance between group centroids, Whites were just as distinguishable from Hispanics as Blacks were from Whites.
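
The centroid comparison can be sketched as follows. The 2-D points stand in for principal-component scores and are invented so that the two distances come out equal; they are not taken from the thread's data.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

# Invented 2-D "PC scores" per officially assigned group
pc_scores = {
    "White":    [(0.0, 0.0), (2.0, 0.0)],
    "Hispanic": [(3.0, 4.0), (5.0, 4.0)],
    "Black":    [(1.0, -4.0), (1.0, -6.0)],
}

centroids = {group: centroid(pts) for group, pts in pc_scores.items()}

# Euclidean distance between group centroids
white_hispanic = math.dist(centroids["White"], centroids["Hispanic"])
white_black = math.dist(centroids["White"], centroids["Black"])

print(white_hispanic, white_black)
```

Equal centroid distances say the groups are equally far apart on average; they say nothing by themselves about how much the point clouds overlap.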
To further validate the method by visual inspection, we compared low- and high-confidence classifications. In high-confidence disagreements, the person almost always appeared to be the predicted race rather than the assigned race.
We corrected for misclassification:

Hispanic criminal record rates increase 20-31%
White rates decrease 4-6%
Black rates decrease 1%

The lower bound reassigns only high-confidence cases (>90% confidence); the upper bound assumes every predicted race is the actual race.
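
One way to read the two bounds is as two relabeling rules applied to the same records. The sketch below uses ten invented rows and hypothetical helper names; it works on counts, which change by the same percentage as rates as long as the population denominators are held fixed.

```python
# (official label, predicted label, model confidence): invented toy rows
records = [
    ("White", "White", 0.99),
    ("White", "White", 0.95),
    ("White", "White", 0.80),
    ("White", "Hispanic", 0.97),
    ("White", "Hispanic", 0.60),
    ("Hispanic", "Hispanic", 0.98),
    ("Hispanic", "Hispanic", 0.85),
    ("Black", "Black", 0.99),
    ("Black", "Black", 0.93),
    ("Black", "White", 0.55),
]

def corrected_counts(rows, threshold=None):
    """Count records per group after relabeling.
    threshold=None -> always trust the prediction (upper bound).
    threshold=0.90 -> use the prediction only when confidence exceeds 0.90."""
    counts = {}
    for official, predicted, conf in rows:
        use_prediction = threshold is None or conf > threshold
        label = predicted if use_prediction else official
        counts[label] = counts.get(label, 0) + 1
    return counts

def pct_change(before, after):
    """Percent change per group relative to the official counts."""
    return {g: 100.0 * (after.get(g, 0) - n) / n for g, n in before.items()}

official = {}
for label, _, _ in records:
    official[label] = official.get(label, 0) + 1

lower = pct_change(official, corrected_counts(records, threshold=0.90))
upper = pct_change(official, corrected_counts(records))

print("lower bound:", lower)
print("upper bound:", upper)
```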
State-level analysis showed massive variation.

Florida: 60%+ of Hispanics misclassified as White (Cubans tend to self-id as White?)

But no statistically significant correlation with political ideology (r = 0.21, p = 0.472).
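
For anyone checking a correlation like this on their own state-level table, here is a small sketch. The six data points are invented, and the p-value comes from a permutation test, swapped in here because the standard library has no t-distribution; the thread does not say which test it used.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value: how often shuffled data gives |r| at least
    as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero
    return (extreme + 1) / (n_perm + 1)

# Invented values for six hypothetical states
ideology_score = [-0.8, -0.5, -0.2, 0.1, 0.4, 0.7]
misclassified_share = [0.30, 0.10, 0.45, 0.20, 0.50, 0.15]

r = pearson_r(ideology_score, misclassified_share)
p = permutation_p_value(ideology_score, misclassified_share)
print(f"r = {r:.2f}, permutation p = {p:.3f}")
```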
For the full analysis, replication code, data, and GitHub repository, check out my blog post:

White by Default: Systematic Bias in U.S. Criminal Racial Assignment

uncorrelated.xyz/p/white-by-def…
