Jacy Reese Anthis
Humanity will soon coexist with a new class of beings. I research the rise of these "digital minds." ML/HCI/sociology/stats @SentienceInst @UChicago @Stanford
Jun 6 9 tweets 2 min read
Many speak idyllically about a world with "fair" or "unbiased" LLMs, but is that even possible? In our new preprint, we take the most well-defined principle of AI safety/ethics and show that, in reality, an LLM could never be fair under any definition in the current ML literature. 👇

@KLdivergence @alexdamour @ChenhaoTan With ML models for tasks like sentencing criminals or hiring job applicants, you might impose a constraint like "fairness through unawareness" (e.g., your model doesn't take race/gender as input), but not with LLMs or any general-purpose model built on unstructured data. (Section 3.1)
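For readers unfamiliar with the contrast, here is a minimal, hypothetical sketch of what "fairness through unawareness" looks like for a tabular model; this is not code from the preprint, and the column names (race, gender, hired) are placeholders.

```python
# Hypothetical illustration of "fairness through unawareness" on tabular data:
# the model is simply never shown the protected attributes.
import pandas as pd
from sklearn.linear_model import LogisticRegression

PROTECTED = ["race", "gender"]  # assumed column names, for illustration only

def fit_unaware_model(df: pd.DataFrame, label: str = "hired") -> LogisticRegression:
    X = df.drop(columns=PROTECTED + [label])  # drop protected inputs before training
    y = df[label]
    return LogisticRegression(max_iter=1000).fit(X, y)

# With an LLM there is no analogous step: free-form text (names, dialect,
# biographical details) encodes protected attributes implicitly, so
# "unawareness" cannot be enforced by deleting columns.
```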
Oct 31, 2023 4 tweets 2 min read
It seems a fair AI would treat you the same if you had a different race/gender/disability/etc., but how can we ever test counterfactual fairness? In #NeurIPS2023 w @victorveitch, we show you sometimes can with simple, observed metrics like group parity! 🧵 arxiv.org/abs/2310.19691

First, many say fairness must come at a cost of accuracy, an inevitable trade-off, but we rebut this with a new motivation for counterfactual fairness. In plausible "causal contexts," CF is actually optimal in terms of accuracy in an aspirational unbiased target domain. We can get both!

Let $\mathcal{F}^{\textrm{CF}}$ be the set of all counterfactually fair predictors, and let $\ell$ be a proper scoring rule (e.g., squared error, cross-entropy loss). Let the counterfactually fair predictor that minimizes risk on the training distribution $X, Y, A \sim P$ be
$$
    f^*(X) := \underset{f \in \mathcal{F}^{\textrm{CF}}}{\operatorname{argmin}} \ \mathbb{E}_{P}[\ell(f(X), Y)].
$$
Then $f^*$ also minimizes risk on the target distribution $X, Y, A \sim Q$ with no selection effects, i.e.,
$$
    f^*(X) = \underset{f}{\operatorname{argmin}} \ \mathbb{E}_{Q}[\ell(f(X), Y)].
$$
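A minimal, hypothetical sketch of the kind of observable check this result licenses (not code from the paper): under the paper's causal-context assumptions, counterfactual fairness can be probed via group parity of predictions, which requires only observed data. The function name, tolerance of 0.05, and random example data are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def group_parity_gap(preds: np.ndarray, groups: np.ndarray) -> float:
    """Largest absolute difference in mean prediction across observed groups
    (group/demographic parity, an observable quantity)."""
    df = pd.DataFrame({"pred": preds, "group": groups})
    means = df.groupby("group")["pred"].mean()
    return float(means.max() - means.min())

# Example: flag a predictor whose group gap exceeds a chosen tolerance.
# In the causal contexts discussed in the paper, a large gap is evidence
# against counterfactual fairness; a small gap is consistent with it.
rng = np.random.default_rng(0)
preds = rng.random(1000)
groups = rng.choice(["A", "B"], size=1000)
print(group_parity_gap(preds, groups) < 0.05)
```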
May 8, 2023 6 tweets 4 min read
I discussed digital minds, AI rights, and mesa-optimizers with @AnnieLowrey at @TheAtlantic. Humanity's treatment of animals does not bode well for how AIs will treat us or how we will treat sentient AIs. We must move forward with caution and humility. 🧵 theatlantic.com/ideas/archive/…

@AnnieLowrey @TheAtlantic In our 2021 AIMS survey, we found:

- The average US adult thinks AIs will be sentient within 10 years.
- 18% think some AIs are already sentient.
- 58% support a ban on developing sentient AI.
- 75% think sentient AIs deserve to be treated with respect. sentienceinstitute.org/aims-survey-20…
Mar 25, 2022 12 tweets 3 min read
My new paper introduces semanticism, a theory of consciousness that solves the 'hard problem' by showing that consciousness and qualia do not exist as often assumed. To my knowledge, it is the most precise statement of eliminativism or illusionism to date. link.springer.com/content/pdf/10…

I distinguish two analytically distinct usages of 'consciousness':

- a weak version, consciousness-as-self-reference (e.g. 'I think, therefore I am')
- a strong version, consciousness-as-property (e.g. a question like 'Is this AI conscious?' that extends across multiple entities)
Feb 13, 2022 10 tweets 11 min read
I'm glad to see leading voices in AI like @sama, @ylecun, and @ilyasut ponder the question of artificial consciousness. We research this @SentienceInst because it's one of the most important questions for the long-term future. I'd like to introduce the topic in a brief thread 1/n

First, the meaning of consciousness is deeply contested among philosophers and scientists. There are 3+ more precise terms:

- thought, typically a linguistic stream of words
- perception, either through the 5 senses or imagination
- sentience, positive and negative emotions 2/n
Jun 15, 2020 15 tweets 6 min read
Humans hunt the largest animals and drive them to extinction. We have always done so. The first Homo sapiens lived with some amazing larger-than-life megafauna that might still be alive today if we weren't such a murderous species. An appreciation thread for these real-life titans:

Megatherium: 6m (20ft) giant sloth weighing up to 4 tons. Like modern sloths, they slowly lumbered through the jungle. They could probably walk on two legs. Fossils have cut marks, indicating overlap with early humans. Extinct around 8,500 BC (first Homo sapiens ~300,000 BC).
Mar 9, 2020 11 tweets 4 min read
Most people have seen the scary math on how unprepared humanity is for #COVID2019, particularly US medical infrastructure (e.g. masks, hospital beds, HT @LizSpecht). But I don't think most people realize that this crisis, and the S&P 500 drop, was entirely foreseeable. A thread. 1/11

The first layer of the onion is that we knew a global pandemic was coming. For just one example, in a 2015 discussion with @EzraKlein, @BillGates said a deadly flu-like pandemic is the most predictable disaster in the history of the human race (vox.com/2015/5/27/8660…). 2/11