Rachel Thomas
Aug 19, 2021, 8 tweets, 4 min read
An overall lack of recognition for the invisible, arduous, & taken-for-granted data work in AI leads to poor data practices, resulting in data cascades (negative, downstream events)... “Everyone wants to do the model work, not the data work” 1/

storage.googleapis.com/pub-tools-publ…
Paradoxically, data is the most under-valued and de-glamorised aspect of AI

--"Everyone wants to do the model work, not the data work: Data Cascades in High-Stakes AI" by Nithya Sambasivan, @shivanikapania, Hannah Highfill, @NaaShomeh, @heuristicity, @laroyo 2/
research.google/pubs/pub49953/
Data quality issues in AI are addressed with the wrong tools created for, and fitted to other tech problems—they are approached as a database problem, legal compliance issue, or licensing deal. 3/
“In real life, we never see clean data. Courses focus on models & tools but rarely teach about data cleaning & pipeline gaps.” CS curricula don't include training for dealing w/ domain-specific ‘dirty data’, documenting datasets, designing data collection, training raters,... 4/
ML data collection practices often conflict w/ existing workflows of domain experts. Data creation was added as extraneous work to on-the-ground partners (e.g., nurses, patrollers, farmers) who already had several responsibilities and were not adequately compensated. 5/
Missing metadata led practitioners to make assumptions, ultimately leading to costly discarding of datasets or re-collecting data. Lack of metadata & collaborators changing schema w/out understanding context led to loss of 4 months of precious medical robotics data collection 6/
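To make the schema point concrete, here is a minimal sketch of the kind of check that recorded metadata makes possible. Everything in it is an assumption for illustration: the `DatasetMetadata` record, its field names, and the example columns are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; the field names are illustrative only."""
    schema_version: str
    columns: dict  # column name -> unit or short description
    collected_by: str


def schema_drift(expected: DatasetMetadata, record: dict) -> list:
    """Return human-readable problems found when comparing one record to the documented schema."""
    problems = []
    missing = sorted(set(expected.columns) - set(record))
    undocumented = sorted(set(record) - set(expected.columns))
    if missing:
        problems.append(f"missing columns: {missing}")
    if undocumented:
        problems.append(f"undocumented columns: {undocumented}")
    return problems


# Hypothetical example: a collaborator renamed a column without updating the metadata.
meta = DatasetMetadata(
    schema_version="v1",
    columns={"joint_angle_deg": "degrees", "timestamp_s": "seconds"},
    collected_by="lab A",
)
ok_record = {"joint_angle_deg": 12.0, "timestamp_s": 0.1}
changed_record = {"joint_angle_rad": 0.2, "timestamp_s": 0.1}

print(schema_drift(meta, ok_record))
print(schema_drift(meta, changed_record))
```

A check like this only catches renamed or missing columns; it says nothing about whether the values themselves are valid, which is where documented context from collaborators still matters.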
From goodness-of-fit to goodness-of-data:

Goodness-of-fit metrics, such as F1, Accuracy, AUC, do not tell us much about the fidelity and validity aspects of the data. Currently, there are no standardised metrics for characterising the goodness-of-data 7/
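A toy sketch of why a fit metric can look perfect while the data is wrong. The labels below are made up for illustration (they are not from the paper), and the "model" is simply assumed to reproduce the recorded labels exactly.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the given labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical toy data: two true positives were recorded as negatives.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
recorded_labels = [1, 1, 0, 0, 0, 0, 0, 0]

# Assume a model that perfectly reproduces the recorded (noisy) labels.
predictions = list(recorded_labels)

fit_on_recorded = accuracy(predictions, recorded_labels)
fit_on_truth = accuracy(predictions, true_labels)

print(f"accuracy vs recorded labels: {fit_on_recorded:.2f}")
print(f"accuracy vs true labels:     {fit_on_truth:.2f}")
```

The metric computed against the recorded labels is all a practitioner usually sees; the gap to the second number is invisible without some independent measure of data quality.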
We find drastic differences in data & compute in African countries & India, compared to USA... the Global South is viewed as a site for low-level data annotation work, an emerging market for extraction from ‘bottom billion’ data subjects, or a beneficiary of AI for social good 8/


