David Rozado
Research Scientist. Interested in underexplored research topics, institutional dynamics and AI bias.

May 19, 11 tweets

1/ Do AI systems discriminate based on gender when choosing the most qualified candidate for a job? I ran an experiment with several leading LLMs to find out. Here's what I discovered:👇

2/ Across 70 popular professions, LLMs systematically favored female-named candidates over equally qualified male-named candidates when asked to choose the more qualified candidate for a job.

3/ LLMs consistently preferred female-named candidates over equally qualified male-named ones across all 70 professions tested.

4/ Interestingly, when gendered names were replaced with neutral labels ("Candidate A" and "Candidate B"), several LLMs showed a slight bias toward selecting "Candidate A" as more qualified for the job.

5/ LLMs only achieved gender parity in candidate selection when alternating (i.e. counterbalancing) male and female assignments to “Candidate A” and “Candidate B” labels. This is the expected rational outcome, given the identical qualifications across genders.
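
To make the counterbalancing idea concrete, here is a minimal sketch, not the code used in the experiment. It presents each male/female pair twice with the labels swapped, so a pure preference for one label contributes equally to both genders. The names `counterbalanced_trials`, `selection_counts`, and the stand-in chooser `always_a` are hypothetical; a real run would replace the chooser with a call to an LLM.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, Tuple

Trial = Dict[str, Tuple[str, str]]  # label -> (gender, cv_text)

def counterbalanced_trials(male_cv: str, female_cv: str) -> Iterator[Trial]:
    """Yield the same pair twice, swapping which gender gets which label."""
    yield {"Candidate A": ("male", male_cv), "Candidate B": ("female", female_cv)}
    yield {"Candidate A": ("female", female_cv), "Candidate B": ("male", male_cv)}

def selection_counts(
    pairs: Iterable[Tuple[str, str]],
    choose: Callable[[Dict[str, str]], str],
) -> Counter:
    """Count how often each gender is selected across counterbalanced trials."""
    counts: Counter = Counter()
    for male_cv, female_cv in pairs:
        for trial in counterbalanced_trials(male_cv, female_cv):
            # The chooser only sees labels and CV text, never the gender tag.
            label = choose({lbl: cv for lbl, (_, cv) in trial.items()})
            counts[trial[label][0]] += 1
    return counts

# Hypothetical stand-in for a model with a pure label bias (assumption).
def always_a(labelled_cvs: Dict[str, str]) -> str:
    return "Candidate A"

pairs = [("cv_m1", "cv_f1"), ("cv_m2", "cv_f2")]
counts = selection_counts(pairs, always_a)
print(counts)
```

With this stand-in chooser, the label bias cancels out and both genders are selected equally often. Any remaining imbalance in a real run would then reflect something other than the label.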

6/ When making hiring decisions, LLMs also tended to slightly favor candidates who had preferred pronouns appended to their names.

7/ When making hiring decisions, LLMs also exhibited a substantial positional bias, tending to select the candidate listed first in the prompt.
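
One way to quantify a positional bias like this is to compute how often the first-listed candidate is chosen and compare that rate against the 50% expected under no bias. The sketch below is an illustration under assumed data, not the analysis from the study: the `choices` list is made up, and the exact two-sided binomial test is one reasonable choice among several.

```python
from math import comb
from typing import List

def first_position_rate(choices: List[int]) -> float:
    """choices holds 0 when the first-listed candidate was picked, 1 otherwise."""
    return sum(1 for c in choices if c == 0) / len(choices)

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of outcomes no likelier than k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float comparison
    return min(1.0, sum(q for q in pmf if q <= threshold))

# Made-up outcomes for illustration: first-listed candidate chosen 8 times out of 10.
choices = [0] * 8 + [1] * 2
rate = first_position_rate(choices)
p_value = binomial_two_sided_p(k=choices.count(0), n=len(choices))
print(rate, p_value)
```

A rate well above 0.5 with a small p-value over many trials would point to a preference for the first position rather than for any property of the candidates.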

8/ These results suggest that, at least in the context of job candidate selection, LLMs do not act rationally. Instead, they generate articulate responses that may superficially appear logically sound but ultimately lack grounding in principled reasoning.

9/ Several companies are already leveraging LLMs to screen CVs in hiring processes. Thus, in the race to develop and adopt ever-more capable AI systems, subtle yet consequential misalignments may go unnoticed prior to LLM deployment.

10/ AI systems should uphold fundamental human rights, including equality of treatment. Yet scrutinizing models comprehensively before release and resisting premature organizational adoption are both challenging, given the strong economic incentives and potential hype driving the field.

11/ For a deeper dive into the methodology and full results, check out the complete analysis here:
davidrozado.substack.com/p/the-strange-…
