Today, technical experts hold the tools to conduct system-scale algorithm audits, so they largely decide what algorithmic harms are surfaced. Our #cscw2022 paper asks: how could *everyday users* explore where a system disagrees with their perspectives? hci.st/end-user-audit 🧵
(2/6) User-led audits at this scale are challenging: just to get started, they require substantial user effort to label and make sense of thousands of system outputs. Could users label just 20 examples and jump straight to the valuable part: sharing their unique perspectives?
(3/6) Our IndieLabel auditing tool lets users do exactly this. Leveraging the varied annotator perspectives *already present* in ML datasets, we use collaborative filtering to scale a user’s 20 labels to 100k predicted labels, so they can explore where they diverge from the system’s outputs.
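(Not the paper’s code, just a minimal sketch of the collaborative-filtering idea, assuming item embeddings already learned from existing annotators’ labels: a new user’s 20 labels fit their latent vector via ridge regression, which then predicts their labels for all 100k examples. All names, sizes, and the toy data are illustrative.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 100k system outputs, 16-dim latent space.
n_examples, k = 100_000, 16

# Stand-in for item embeddings learned from existing annotators' labels
# (e.g., via matrix factorization over the annotator-by-example matrix).
item_emb = rng.normal(size=(n_examples, k))

# Toy "true" user we pretend not to know, used only to simulate labels.
true_user = rng.normal(size=k)
all_labels = item_emb @ true_user + 0.1 * rng.normal(size=n_examples)

# The end user labels only 20 examples.
seed_idx = rng.choice(n_examples, size=20, replace=False)
X, y = item_emb[seed_idx], all_labels[seed_idx]

# Infer the user's latent vector by ridge regression on those 20 labels;
# the regularizer keeps the solve stable with so few observations.
lam = 1.0
user_vec = np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

# Predict the user's labels for all 100k examples and check agreement.
pred = item_emb @ user_vec
print("corr with held-out labels:",
      round(float(np.corrcoef(pred, all_labels)[0, 1]), 3))
```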
(4/6) We had 17 non-technical users lead end-user audits of Perspective API. With just 30 mins to conduct their audits, participants independently replicated issues that expert audits had previously identified, and they raised underreported issues like the over-flagging of reclaimed slurs.
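(For concreteness, a hedged sketch of the core auditing move: comparing your own rating of a comment to the model’s toxicity score. The request shape follows Perspective API’s public comments:analyze endpoint; the API key, comment text, and threshold are placeholders.)

```python
import requests

API_KEY = "YOUR_API_KEY"  # hypothetical placeholder
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity(text: str) -> float:
    """Return Perspective API's TOXICITY probability for `text`."""
    body = {"comment": {"text": text},
            "requestedAttributes": {"TOXICITY": {}}}
    resp = requests.post(URL, json=body, timeout=10)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Flag a potential audit finding when the user's own rating and the
# model's score diverge by more than an (arbitrary) threshold.
my_rating = 0.1  # the auditor reads this comment as non-toxic
score = toxicity("an example comment to audit")
if abs(score - my_rating) > 0.5:
    print(f"potential finding: model={score:.2f}, me={my_rating}")
```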
(5/6) What does this mean for the future of algorithm audits? By enabling individual users to take the lead, end-user audits uncovered important issues that had been overlooked. We need more methods that go beyond monolithic demographic groups and amplify the voices of everyday users.