The computer science maxim "garbage in, garbage out," (#GIGO) dates back at least as far as 1957. It's an iron law of computing: no matter how powerful your data-processing system is, if you feed it low-quality data, you'll get low-quality conclusions.
1/
And of course, machine learning (AKA "AI") (ugh) does not repeal GIGO. Far from it. ML systems that operate on garbage data produce garbage predictive models, which produce garbage conclusions at vast scale, coated with a veneer of algorithmic objectivity facewash.
2/
The scale and credibility of ML-derived GIGO present huge risks to our society in domains as varied as the credit system, criminal justice, hiring, education - even whether your kids will be taken away by Child Protective Services.
3/
To make this all worse, the vast data-sets used to train ML systems are in scarce supply, which leads multiple ML models to be trained on the same data, enshrining the defects of that data in all kinds of systems.
4/
One of the most significant training datasets is Imagenet, a collection of 14m labeled images that jumpstarted the ML revolution in 2012. As @willknight writes for @WIRED, Imagenet's labels came from low-waged, undersupervised workers.
5/
Imagenet is one of the data-sets examined in new research from MIT's Curtis Northcutt and colleagues, who found that Imagenet and other comparable datasets have a typical error rate of about 6%.
6/
This small margin of error has big consequences: first, because the errors aren't evenly distributed, and instead cluster around the kinds of biases that labelers have (for example, labeling images of women medical professionals with "nurse" and men with "doctor").
7/
And second, because the incorrect labels obscure relative performance differences between models. When one model does better than another, you can't know if that's because it is a better model, or because it's less sensitive to incorrect labels.
8/
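To make that second point concrete, here is a minimal toy sketch in Python. Everything in it is an assumption made for illustration: the 1,000-example benchmark, the two hypothetical models (`model_a`, `model_b`), and where their mistakes fall are all invented, and only the 6% label-error figure echoes the rate mentioned above. It is not the researchers' method or data. It simply shows how a model that repeats the labelers' mistakes can outscore a better model when both are graded against flawed labels.

```python
# Toy illustration with made-up numbers; not data from the study.
N = 1000
NOISE = 60  # 6% of benchmark labels are wrong, echoing the rate cited above

# Ground truth, plus a benchmark whose first NOISE labels are flipped.
true_labels = [i % 2 for i in range(N)]
benchmark_labels = [1 - y if i < NOISE else y for i, y in enumerate(true_labels)]


def flip(labels, start, stop):
    """Return a copy of labels with the entries in [start, stop) flipped."""
    return [1 - y if start <= i < stop else y for i, y in enumerate(labels)]


def accuracy(preds, labels):
    """Fraction of predictions that match the given labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical model A: matches the truth except on 50 correctly labeled examples.
model_a = flip(true_labels, NOISE, NOISE + 50)
# Hypothetical model B: repeats the labelers' mistakes and misses 80 more examples.
model_b = flip(benchmark_labels, NOISE, NOISE + 80)

for name, preds in (("A", model_a), ("B", model_b)):
    print(
        f"model {name}: "
        f"benchmark accuracy {accuracy(preds, benchmark_labels):.2f}, "
        f"true accuracy {accuracy(preds, true_labels):.2f}"
    )
```

In this contrived setup, model B looks better on the benchmark while model A is better against the true labels, so the leaderboard ranks them the wrong way round.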
ETA - If you'd like an unrolled version of this thread to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:
It's a zombie economy. For 40 years, we've eroded the wages of workers and transferred their share of profit and productivity to owners of capital. This is a problem, because people need money to buy things, and if they run out of money, they stop buying and profits vanish.
1/
Time and again, capitalism has kicked any reckoning over this down the road. First came the great liquidation: pension cashouts, raided savings, reverse mortgages. Then came consumer borrowing, a tidal wave of unrepayable debt.
2/
That's the zombie part: all the unpayable debt, which has been turned into bonds that enrich debt-holders. As Michael Hudson has told us again and again, debt that can't be paid, won't be paid. Our debt-based economy is the walking dead, a zombie.
3/
Ontario's drug-dealer premier is shockingly bad at distributing vaccines: How is it that Doug Ford was so good at slinging hash, and is so bad at vaccinating?
Ontario politics are a wild ride, but they rarely escape the province, or, at most, the nation. Which is weird, because Ontario has been a leading indicator of neoliberalism's cruelty, paranoia, and surrealism since (at least) the mid-nineties.
1/
Start with the 1995 election of Conservative Premier Mike Harris, a bland, dead-eyed sociopath whose "Common Sense Revolution" slashed Ontario's excellent public services and implemented a forced-labor program for poor people, AKA "workfare."
2/
Harris was a Romneyish sort of fellow: a personality-free, interchangeable suit who didn't raise anyone's pulse but excelled at administration. His major achievement was the amalgamation of Toronto: a forced merger of the City of Toronto with its heretofore separate suburbs.
3/