Excited for this final keynote! For those not in the know, Julia Angwin is the journalist behind ProPublica's "Machine Bias" article, which just about everyone in this field now cites. She also founded The Markup & is the EIC there. Her work has been field-changing.
@JuliaAngwin is talking about how The Markup does things differently, emphasizing building trust with readers. They do this by writing stories and showing their analysis work, but also through a privacy promise: not tracking *anything* about people who visit their website. No cookies!
@JuliaAngwin: "We don't participate in the game that is pretty common in Silicon Valley .... we don't think someone who gets paid to be a spokesperson for an organization deserves the cloak of anonymity. That's what we do differently from other journalists they might talk to."
@JuliaAngwin talks about their team that is "overinvested" in engineers -- 5 engineers and 8 reporters. Talks about the importance of the data and the replicability of their findings. They constantly ask domain experts to check whether they've done their analyses correctly.
@JuliaAngwin, on the importance of their data & open-sourcing it: "Data can help sway a debate. When we put out a new dataset, we do have the ability to help that conversation be more informed and help policy makers make better policy. That's our goal."
@JuliaAngwin is now talking about the Machine Bias article, and how they immediately saw a difference: Black defendants had scores evenly distributed from 1-10, whereas White defendants had scores that skewed towards lower risk. That led to them asking what they did.
They didn't just stop there; they followed up on these risk scores by joining the data with what happened to those defendants over the following 2 years. Did they actually get re-arrested, as the scores predicted? And they realized Black defendants were more likely to be rated incorrectly.
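A rough sketch of that kind of check, for anyone curious what "joining scores with outcomes" looks like in practice. This is not ProPublica's actual code or data: the records, the group labels, the cutoff of 5 for "high risk", and the function name are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (defendant_id, group, risk_score from 1 to 10).
scores = [
    (1, "A", 8), (2, "A", 3), (3, "A", 7), (4, "A", 6),
    (5, "B", 2), (6, "B", 9), (7, "B", 3), (8, "B", 4),
]

# Hypothetical outcomes: defendant_id -> re-arrested during the follow-up window.
rearrested = {1: True, 2: False, 3: False, 4: False,
              5: True, 6: True, 7: False, 8: False}

HIGH_RISK_CUTOFF = 5  # assumed threshold, chosen only for this example


def error_rates(scores, outcomes, cutoff=HIGH_RISK_CUTOFF):
    """Join scores with outcomes and compute per-group error rates."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for defendant_id, group, score in scores:
        if defendant_id not in outcomes:
            continue  # no follow-up data, so this record cannot be evaluated
        predicted_high = score >= cutoff
        was_rearrested = outcomes[defendant_id]
        if predicted_high and was_rearrested:
            counts[group]["tp"] += 1
        elif predicted_high and not was_rearrested:
            counts[group]["fp"] += 1
        elif not predicted_high and was_rearrested:
            counts[group]["fn"] += 1
        else:
            counts[group]["tn"] += 1

    rates = {}
    for group, c in counts.items():
        not_rearrested = c["fp"] + c["tn"]
        were_rearrested = c["fn"] + c["tp"]
        rates[group] = {
            # Share of people NOT re-arrested who were still labeled high risk.
            "false_positive_rate": c["fp"] / not_rearrested if not_rearrested else None,
            # Share of people re-arrested who had been labeled low risk.
            "false_negative_rate": c["fn"] / were_rearrested if were_rearrested else None,
        }
    return rates


rates = error_rates(scores, rearrested)
for group, r in sorted(rates.items()):
    print(group, r)
```

The point of splitting the errors this way is that two groups can be "rated incorrectly" in different directions: one more often flagged high risk without being re-arrested, the other more often flagged low risk and then re-arrested.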
She reflects on how this work spanned a year of effort for 4 people, working with the data & getting the defendants' stories. "It was an enormous amount of work. It was not the kind of work that newsrooms typically do, & it made me feel like I wanted to do more of that."
On different case studies: "Allstate has been arguing in almost every state that their algorithm is a trade secret. Our analysis was one of the first ways that regulators had a chance to see it because we got a hold of a filing they made."
They did an investigation into which emails Google puts into the promotions tab in your inbox, and it came down to signing up for tons of mailing lists, such as those of all the 2020 presidential candidates. Found that emails from some folks, like Bernie, were always put into promotions.
Just incredible case study after case study on how they looked for different ways to collect data and test algorithms and systems, which ranged across all sorts of methods, not always automatable. And then they would release their findings on GitHub for others to report on as well.
@JuliaAngwin mentions a tool called Blacklight that they created to investigate algorithms, which pulls up a real-time scan of what privacy violations are happening on a website. They've since made the tool live for the public to use.

themarkup.org/blacklight
And the other tool she mentions is Citizen Browser, which is a custom web browser they created to report on what content social media platforms were pushing to users. A lot of articles have come out of the use of this tool:

themarkup.org/citizen-browser
@JuliaAngwin: "There's no really good way for the world to know that FB is holding up its promises. We have invested in this extremely expensive & difficult sharing project, but I think it's had some really important fruits already."
@JuliaAngwin: "All data is political. The reason we spend so much time and energy and money collecting our own data is because if you rely on data collected by others, you're buying into their political agenda."
@JuliaAngwin: "We're watchdogs. Our job is to hold institutions and the powerful accountable. In today's world, the way to do that is to collect our own data."
My Q: What about the limitations and biases in data?

"All data is terrible. I've never gotten a dataset that isn't horrible in some way. We have a section in our methodologies and often in the story itself that really addresses the limitations of our data."
"We don't see ourselves as having to be complete. We see ourselves as having to be really clear about what we do and what we don't know. I have incomplete data. It's still better than any other data available, so we just say that and we just own it."

Keep Current with local oakland enby @ #FAccT21 (pls mute me)

