Thread of some posts about diversity & inclusion I've written over the years. I still stand behind these.
(I'm resharing because a few folks are suggesting that Jeremy's CoC experience is partially our fault for promoting diversity, that we should change our values, etc. Nope!)
1/
Math & CS have been my focus since high school/the late 90s, yet the sexism & toxicity of the tech industry drove me to quit. I’m not alone. 40% of women working in tech leave. (2015)
The primary reason women leave the tech industry is that they are treated unfairly: underpaid, less likely to be fast-tracked, and not given a fair chance to advance. (2016)
Tech Interviews are Terrible. One key problem is that people primarily like to hire people like themselves. Some research, case studies, & guidelines. (2017)
The tech industry’s glorification (and often requirement) of long hours means that many people with chronic illnesses & disabilities are unable to work at many tech companies. (2019) medium.com/s/story/techs-… 8/
New free online course: Practical Data Ethics, from fast.ai & @DataInstituteSF, covering disinformation, bias, ethical foundations, privacy & surveillance, the Silicon Valley ecosystem, and algorithmic colonialism
As @cfiesler showed with her spreadsheet of >250 tech ethics syllabi & accompanying meta-analysis, tech ethics is a sprawling subject. No single course can cover everything. And there are so many great courses out there!
I spent a lot of time trying to cut my assigned reading list down to a reasonable length, as there are so many fantastic articles & papers on these topics. The following list is not at all exhaustive.
Another form of measurement bias is systematic error, such as how pulse oximeters (a crucial tool in treating COVID) and Fitbit heart rate monitors (used in 300 clinical trials) are less accurate on people of color 3/
Structural racism can be combated only through political will, not more data. Ending racism has to begin and end with that will; data, while helpful in guiding policy focus, are not a shortcut to creating it.
Data are not merely recorded or collected; they are produced. Data extraction infrastructures comprise multiple points of subjectivity: design, collection, analysis, interpretation and dissemination. All of these open the door to exploitation. 2/
In South Korea, digital COVID tracking has exacerbated hostility towards LGBTQ people.
When UK researchers set out to collect better data on Roma migrants to assess social needs, missteps in data presentation gave rise to political outcry over an "influx" of migrants. 3/
Things to know if you work on medical ML:
- Medical data can be incomplete, incorrect, missing, & biased
- The medical system is disempowering & often traumatic for patients
- It is crucial to involve patients & to recognize the risk that ML can end up further disempowering them 1/
On bias in medicine (& thus medical data): research shows that the pain of women is taken less seriously than the pain of men. The pain of people of color is taken less seriously than the pain of white people.
Result: longer delays, lower quality of care, & worse outcomes 2/
A meta-analysis of 20 years of published research found that Black patients were 22% less likely than white patients to receive *any* pain medication and 29% less likely to be treated with opioids 3/
Important work is happening in *Participatory ML* and in recognizing that AI ethics is about the *distribution of power*. I want to create a thread linking to some of this work 1/
Datasets (particularly benchmarks) are infrastructure: a foundation for other tools & tech, tending to seep into the background, shaped by specific aims, seeming natural from one perspective but jarring from another @cephaloponderer @alexhanna @amironesei arxiv.org/abs/2007.07399
Focusing on *transparency* of an ML system without plausible actions to change aspects of that system is a Pyrrhic victory. *Contestability*, however, allows us to critically engage within the system.
Let's move beyond insufficient training data as the sole "solution" to discriminatory outcomes. Gathering more training data from populations which are already extensively surveilled ignores how data-gathering operations can serve as another form of "predatory inclusion".