xuan (ɕɥɛn / sh-yen)
PhD Student. MIT ProbComp / CoCoSci. Inverting Bayesian models of human reasoning and decision-making. Pronouns: 祂/伊 Mastodon: @xuanalogue@mas.to
Apr 28 6 tweets 1 min read
my least favorite thing about RL theory is that it has polluted our understanding of human agency with what is at best a theory of biological agency or brain function

it may well be that there are RL systems in the brain that implement human motivation and trial-and-error learning
Dec 21, 2024 10 tweets 2 min read
pretty upset about o3's existence tbh

around mid 2022 I began worrying about MuZero-style architectures & tool-augmented LMs as the main potential source of classical AI risks from strong optimization/planning
Sep 3, 2024 19 tweets 6 min read
Should AI be aligned with human preferences, rewards, or utility functions?

Excited to finally share a preprint that @MicahCarroll @FranklinMatija @hal_ashton & I have worked on for almost 2 years, arguing that AI alignment has to move beyond the preference-reward-utility nexus!

This paper (arxiv.org/abs/2408.16984) is at once a critical review & research agenda.

In it we characterize the role of preferences in AI alignment in terms of 4 preferentist theses. We then highlight their limitations, arguing for alternatives that are ripe for further research.
Mar 1, 2024 16 tweets 5 min read
How can we build AI assistants that *reliably* follow our instructions, even when they're ambiguous?

@Lance_Ying42 & I introduce CLIPS: a Bayesian architecture combining inverse planning with LLMs that *pragmatically* infers human goals from actions & language, then provides assistance!

Imagine you’re in the kitchen with a friend, who places 3 plates on the table then says: “Could you get the forks and knives?” How many should you get?

Intuitively the answer is 3, because you can infer from your friend’s actions that they want to set the table for three people!

[Figure 3: Example goal assistance problem in VirtualHome, where the principal and assistant collaborate to set the dinner table. The principal places three plates on the table, then says “Could you get the forks and knives?”. A pragmatic assistant has to infer the number of forks and knives from context (in this case, three each).]
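To make the inference concrete, here is a toy sketch of pragmatic goal inference in Python. This is illustrative only: the actual CLIPS system combines full Bayesian inverse planning with an LLM-based utterance likelihood, and every name below is hypothetical, not from the paper's code.

def likelihood_actions(n_diners, plates_placed):
    # A goal-directed principal sets out one plate per intended diner.
    return 1.0 if plates_placed == n_diners else 0.01  # small noise term

def likelihood_utterance(n_diners):
    # "Could you get the forks and knives?" is consistent with any table
    # size, so in this toy model the utterance alone is uninformative.
    return 1.0

def goal_posterior(plates_placed=3, max_diners=8):
    # Bayes' rule over the latent number of diners n.
    prior = {n: 1.0 / max_diners for n in range(1, max_diners + 1)}
    unnorm = {n: prior[n] * likelihood_actions(n, plates_placed)
                 * likelihood_utterance(n)
              for n in prior}
    z = sum(unnorm.values())
    return {n: w / z for n, w in unnorm.items()}

print(goal_posterior())  # posterior mass concentrates on n = 3

The point of the sketch: the utterance alone underdetermines the count, but conditioning on the observed actions makes “3 forks, 3 knives” the overwhelmingly probable reading.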
Nov 21, 2023 5 tweets 2 min read
Pretty good explanation of why one might be skeptical (like I am) of transformer-based LLM scaling:

A single forward pass definitely can't express most complicated algorithms.

Autoregressive generation can express much more, but learning will encourage non-generalizable shortcuts.

Even for very simple algorithms like addition or comparison, it seems to me like transformer LLMs are learning *multiple* circuits to solve the same problem, depending on the exact prompt they get (I got this intuition from the experiments in arxiv.org/abs/2305.08809)
For instance, if the core instruction says “Please say yes only if it costs between [1.30] and [8.55] dollars, otherwise no.”, the answer would be “Yes” if the input amount is “3.50 dollars” and “No” if the input is “9.50 dollars”. We restrict the absolute difference between the lower bound and the upper bound to be [2.50, 7.50] due to model errors outside these values – again, behavior that we need to explain.
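For reference, the task the model is being asked to perform is a single interval check; here is a minimal Python sketch of the quoted setup (the bounds are the prompt's placeholders, not fixed constants):

def answer(cost, lower=1.30, upper=8.55):
    # Say "Yes" only if the price falls inside the prompt-specified interval.
    return "Yes" if lower <= cost <= upper else "No"

assert answer(3.50) == "Yes"
assert answer(9.50) == "No"

That model accuracy degrades once the interval width falls outside [2.50, 7.50] is exactly the kind of prompt-dependent behavior that suggests multiple, partially overlapping circuits rather than one general comparison algorithm.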
Mar 26, 2023 16 tweets 6 min read
LLMs *are* just predicting the next word at run time (ruling out beam search etc.)

It's just that predicting the next word isn't inconsistent with doing more complicated stuff under the hood (e.g. Bayesian inference over latent structure). Please read de Finetti's theorem y'all!

The original theorem:

en.m.wikipedia.org/wiki/De_Finett…
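For convenience, here is the standard statement in the binary case, in LaTeX: if X_1, X_2, ... is an infinite exchangeable sequence of 0/1 random variables, there is a unique measure \mu on [0,1] such that

P(X_1 = x_1, \dots, X_n = x_n)
  = \int_0^1 \theta^{\sum_i x_i} \, (1 - \theta)^{\,n - \sum_i x_i} \, d\mu(\theta)

i.e. the sequence is distributed as if a latent \theta were drawn once from \mu and the X_i were then i.i.d. Bernoulli(\theta). So a predictor whose sequential probabilities respect exchangeability is mathematically indistinguishable from one doing Bayesian inference over the latent \theta – "just predicting the next token" and "inferring latent structure" coincide.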
Sep 14, 2021 17 tweets 5 min read
~11 reasons why I transitioned~

YMMV but I much prefer how it sucks to be a woman over how it sucks to be a man.

A lot of my experience is of course tremendously improved by the fact that I have financial security, am accepted by my family and workplace, and am nowadays typically read as a (cis) woman.
Feb 24, 2021 7 tweets 1 min read
it's 2021 and algorithms lecturers are still teaching the stable marriage problem as if it's not heteronormative and alienating af to LGBTQ students 🙃🙃🙃

some suggestions:
- call it the stable matching problem
- use less fraught social analogies, like matching schools to candidates (see the sketch below)
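Nothing about the algorithm itself needs the marriage framing: here is a minimal Gale–Shapley sketch in Python with schools proposing to candidates (variable names are mine; any equal-sized instance with complete preference lists works):

def stable_matching(school_prefs, candidate_prefs):
    # Gale–Shapley with schools proposing. Each input maps one side to its
    # preference list over the other side, most preferred first.
    rank = {c: {s: i for i, s in enumerate(prefs)}
            for c, prefs in candidate_prefs.items()}
    free = list(school_prefs)                 # schools not yet matched
    next_pick = {s: 0 for s in school_prefs}  # index of next candidate to try
    match = {}                                # candidate -> school

    while free:
        s = free.pop()
        c = school_prefs[s][next_pick[s]]
        next_pick[s] += 1
        if c not in match:
            match[c] = s
        elif rank[c][s] < rank[c][match[c]]:  # c prefers s to current match
            free.append(match[c])
            match[c] = s
        else:
            free.append(s)
    return match

schools = {"A": ["x", "y"], "B": ["y", "x"]}
candidates = {"x": ["A", "B"], "y": ["B", "A"]}
print(stable_matching(schools, candidates))  # x matches A, y matches B

Same theorems, same proofs, no couples required.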
Dec 13, 2020 15 tweets 13 min read
What I've been doing this week instead of research: Fighting MIT's ridiculous, inhumane decision to stop funding overseas students unless they return to the US by Jan 30. IN THE MIDDLE OF A PANDEMIC.

We sent an open letter (450+ signatures) in response: tinyurl.com/mit-overseas-f…

MIT's explanation for the Jan 30 deadline? Their interpretation of the 5-month absence rule for overseas students. EXCEPT that the rule is reported to be suspended:

As this student points out, MIT could do much better:
#PayYourStudents #StopRiskingLives