CLS
PhDing @stanfordnlp | teaching language models to do research
Jan 22 9 tweets 3 min read
Can LLMs automate frontier LLM research, like pre-training and post-training?

In our new paper, LLMs discovered post-training methods that beat GRPO (69.4% vs. 48.0%) and pre-training recipes that run faster than nanoGPT (19.7 minutes vs. 35.9 minutes).

1/ Paper:

To be clear about the scope: our automated AI researchers aren’t just tuning hyper-parameters; they often experiment with meaningful algorithmic ideas.

2/ arxiv.org/abs/2601.14525
Sep 9, 2024 14 tweets 6 min read
Automating AI research is exciting! But can LLMs actually produce novel, expert-level research ideas?

After a year-long study, we obtained the first statistically significant conclusion: LLM-generated ideas are more novel than ideas written by expert human researchers.

In our new paper:

We recruited 49 expert NLP researchers to write novel ideas on 7 NLP topics.

We built an LLM agent to generate research ideas on the same 7 topics.

Then we recruited 79 experts to blindly review all the human and LLM ideas.

2/ arxiv.org/abs/2409.04109
May 28, 2024 11 tweets 6 min read
One powerful (and scary) application of LLMs is using them to persuade humans, for good or bad.

I read through recent empirical human studies on LLM persuasion, and here’s a thread summarizing my favorite ones:

(google doc version: docs.google.com/document/d/1il…)

1/ 🧵

I’ll summarize each paper along a few key dimensions:

- Persuasion topics
- Interaction format
- Measurement of persuasiveness
- Main findings

Let’s start with a few studies on measuring the persuasiveness of LLMs.

2/