Thomas G. Dietterich @tdietterich
.@katecrawford makes many very important points. Here is my attempt to translate some of them into engineering terminology. 1/
As ML researchers and practitioners, we tend to formulate the problem as learning a mapping from inputs X to outputs Y. Crawford's point about bias in the data is that we aren't observing the real inputs or the real outputs. 2/
For example, in crime prediction, X is not a sample from the population but instead from the subset of the population investigated by the police. 3/
Y is not the true outcome (e.g., did the person commit a crime), but the conclusion of the legal system. 4/
So a model predicting Y from X is predicting how the legal system will treat a person X whom the police have chosen to investigate. It is not predicting whether a person X' drawn from the general population is guilty (Y'). 5/
To model X'->Y', we need to model the sampling bias of X and the legal bias of Y. We have tools to help with this, so in some sense there *is* an "engineering solution", but it requires good models. The experts on those kinds of models are social scientists. 6/
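One such tool for the sampling bias of X is inverse-probability weighting. The sketch below is a toy simulation, not anything from the thread: the strata `s1` and `s2`, their outcome rates, and their investigation probabilities are all invented for illustration, and the code assumes the investigation probabilities are known exactly. In practice those probabilities are the "good model" that has to come from somewhere. The sketch also does nothing about the legal bias of Y; it treats the observed outcome as the true one.

```python
import random

# Hypothetical strata of X, each with an assumed true rate of Y' = 1 and an
# assumed probability of being investigated (the sampling bias of X).
TRUE_RATE = {"s1": 0.20, "s2": 0.05}
P_INVESTIGATE = {"s1": 0.05, "s2": 0.50}


def draw_investigated(n, seed=0):
    """Simulate n people from the population; return only those investigated."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        stratum = "s1" if rng.random() < 0.5 else "s2"
        outcome = 1 if rng.random() < TRUE_RATE[stratum] else 0
        if rng.random() < P_INVESTIGATE[stratum]:
            sample.append((stratum, outcome))
    return sample


def naive_rate(sample):
    """Outcome rate among the investigated, ignoring how they were selected."""
    return sum(y for _, y in sample) / len(sample)


def ipw_rate(sample):
    """Outcome rate with each person weighted by 1 / P(investigated)."""
    weights = [1.0 / P_INVESTIGATE[s] for s, _ in sample]
    weighted = sum(w * y for w, (_, y) in zip(weights, sample))
    return weighted / sum(weights)


sample = draw_investigated(200_000)
print("naive:", round(naive_rate(sample), 3))
print("reweighted:", round(ipw_rate(sample), 3))
```

By construction the population rate here is 0.5 * 0.20 + 0.5 * 0.05 = 0.125. The naive estimate should land well below that, because the heavily investigated stratum dominates the sample, while the reweighted estimate should land close to it. If the assumed investigation probabilities were wrong, the correction would be wrong too.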
Now let's consider optimizing a policy to achieve some goal. One interpretation of @katecrawford's remarks on representational harm is that we need to think carefully about the optimization objective. 7/
In building a system for Extreme Vetting, for example, the proximal "customer" is the government, and we tend to focus on the costs to the government of false positive and false negative errors. 8/
But she is telling us to think more broadly about the other "customers", namely the people being vetted and the broader society. We must also consider the short- and long-term costs that the vetting process and its false positive and false negative errors impose on these customers. 9/
Modeling those is very challenging, and again, the people with the most expertise are social scientists. The indignity of being "vetted", for example, is itself a cost. 10/
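To see how much the choice of "customers" can matter, here is a minimal sketch of a cost-sensitive decision threshold. Every number in it is hypothetical and chosen only for illustration. It assumes a calibrated probability p that a flag would be correct and assumes correct decisions cost nothing, which notably leaves out the cost of being vetted at all. Under those assumptions, flagging minimizes expected cost when p exceeds cost_fp / (cost_fp + cost_fn).

```python
# Hypothetical per-error costs for each stakeholder, in arbitrary units.
# None of these numbers come from the thread.
COSTS = {
    "government": {"fp": 1.0, "fn": 20.0},
    "person_vetted": {"fp": 15.0, "fn": 0.0},
    "society": {"fp": 4.0, "fn": 5.0},
}


def total_costs(stakeholders):
    """Sum false positive and false negative costs over the chosen stakeholders."""
    fp = sum(COSTS[s]["fp"] for s in stakeholders)
    fn = sum(COSTS[s]["fn"] for s in stakeholders)
    return fp, fn


def flag_threshold(cost_fp, cost_fn):
    """Probability above which flagging has lower expected cost than not flagging.

    Flagging costs (1 - p) * cost_fp in expectation; not flagging costs
    p * cost_fn. Flagging is cheaper when p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


narrow = flag_threshold(*total_costs(["government"]))
broad = flag_threshold(*total_costs(list(COSTS)))
print("threshold, government only:", round(narrow, 3))
print("threshold, all stakeholders:", round(broad, 3))
```

With these made-up costs, counting only the government gives a threshold of 1/21, so almost anyone gets flagged, while counting all three stakeholders raises it to 20/45. The optimizer is the same in both cases; only the objective changed.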
AI tools work by optimizing an objective. If the objective is wrong, the resulting system will be wrong. As with all software engineering efforts, it is important to engage with all of the stakeholders to ensure that we have the correct specifications before optimizing. 11/
Formulating the right specifications is the joint responsibility of the software engineers and the stakeholders. Computer scientists can't expect that other people will do this for us nor that we can do it alone. end/