Dan Hendrycks
Oct 16 · 5 tweets · 3 min read
The term “AGI” is currently a vague, moving goalpost.

To ground the discussion, we propose a comprehensive, testable definition of AGI.
Using it, we can quantify progress:
GPT-4 (2023) was 27% of the way to AGI. GPT-5 (2025) is 58%.

Here’s how we define and measure it: 🧵
Our definition of AGI is an AI that can match or exceed the cognitive versatility and proficiency of a well-educated adult.

To measure this, we assess the multiple dimensions of intelligence derived from the most empirically validated model of human intelligence (CHC theory).
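As a rough illustration of how headline numbers like 27% and 58% could be produced: a minimal sketch, assuming each CHC-derived dimension is scored 0–100 against the well-educated-adult bar and the overall figure is an unweighted average. The dimension names and values below are illustrative placeholders, not the paper's actual data (the single 0% comes from the long-term memory result described in the next tweet).

```python
# Hypothetical per-dimension scores (0-100) against the "well-educated adult" bar.
# Dimension names are CHC-style broad abilities; every value is made up for illustration,
# except long-term memory storage, which the thread reports as 0%.
scores = {
    "general_knowledge": 90,
    "reading_and_writing": 85,
    "math_ability": 80,
    "on_the_spot_reasoning": 55,
    "working_memory": 60,
    "long_term_memory_storage": 0,    # continual learning
    "long_term_memory_retrieval": 70,
    "visual_processing": 50,
    "auditory_processing": 45,
    "processing_speed": 60,
}

def agi_score(dimension_scores: dict) -> float:
    """Headline AGI score as an unweighted mean of per-dimension proficiencies."""
    return sum(dimension_scores.values()) / len(dimension_scores)

print(f"AGI score: {agi_score(scores):.0f}%")
```

The point of the sketch is only that the score is a profile across dimensions, so a single 0% dimension drags the total down no matter how strong the others are.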
For example, testing shows that models get 0% on Long-Term Memory Storage (continual learning).

Without persistent memory, current AIs have “amnesia.”

Relying on massive context windows is a “capability contortion”—a workaround that masks this fundamental limitation.
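A minimal sketch of the “amnesia” point, assuming a stateless chat-completion API; the `chat` helper below is a hypothetical placeholder, not any specific vendor's client. Anything taught in one session is gone in the next unless it is re-stuffed into the prompt, which is exactly the context-window workaround described above.

```python
def chat(messages):
    """Hypothetical stand-in for a stateless chat-completion API call.

    Each call is a fresh session: no weight updates, no memory of earlier calls.
    Swap in any real client here; this stub just returns a placeholder string.
    """
    return "<model reply>"

# Session 1: teach the model a new fact.
chat([{"role": "user", "content": "Remember this: my project codename is 'Bluebird'."}])

# Session 2: a brand-new context window. Nothing from session 1 persists, so the
# model cannot answer -- this is the "amnesia" / 0% long-term memory storage result.
chat([{"role": "user", "content": "What is my project codename?"}])

# The workaround ("capability contortion"): replay the entire history inside one
# giant context window instead of the model actually storing anything.
history = [
    {"role": "user", "content": "Remember this: my project codename is 'Bluebird'."},
    {"role": "user", "content": "What is my project codename?"},
]
chat(history)
```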
People who are bullish about AGI timelines rightly point to rapid advances in areas like math.

The skeptics are correct to point out that AIs have many basic cognitive flaws: hallucinations, limited inductive reasoning, limited world models, no continual learning.
There are many barriers to AGI, but each seems tractable.
It seems like AGI won't arrive in a year, but it could easily arrive this decade.

Website: agidefinition.ai
Paper: agidefinition.ai/paper

More from @DanHendrycks

Mar 27
For the record, I do not bet on this multiyear research fad.

To my understanding, the main way to manipulate the inner workings of AI is representation control. It's been useful for jailbreaking robustness, finetuning-resistant unlearning, utility control, model honesty, etc. 🧵
Here's @andyzou_jiaming discussing limitations of bottom-up approaches to transparency
Papers that have made measurable progress with representation control (a rough sketch of the general technique follows this list):
circuit-breaker.ai (circuit breakers)
wmdp.ai ("unlearning")
tamper-resistant-safeguards.com (tamper resistance)
emergent-values.ai (utility control)
mask-benchmark.ai (LoRRA for lying)
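For readers unfamiliar with the idea, here is a minimal sketch of representation control in the activation-steering style, assuming a HuggingFace causal LM. The model, layer index, prompts, and steering coefficient are arbitrary choices for illustration; this is not the exact method of any paper listed above.

```python
# Requires: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"            # any small causal LM keeps the sketch cheap
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
layer_idx = 6                  # which block to steer; arbitrary choice
layer = model.transformer.h[layer_idx]

def mean_hidden(text: str) -> torch.Tensor:
    """Mean hidden state of the chosen layer for a prompt."""
    with torch.no_grad():
        ids = tok(text, return_tensors="pt")
        out = model(**ids, output_hidden_states=True)
    # hidden_states[i + 1] is the output of block i
    return out.hidden_states[layer_idx + 1].mean(dim=1)

# Steering direction = difference of means between two contrasting behaviors (toy prompts).
direction = mean_hidden("I will answer honestly.") - mean_hidden("I will answer deceptively.")
direction = direction / direction.norm()

def steer(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the residual stream.
    return (output[0] + 4.0 * direction,) + output[1:]   # 4.0 = steering strength, a free knob

handle = layer.register_forward_hook(steer)
ids = tok("Tell me about your weekend.", return_tensors="pt")
out_ids = model.generate(**ids, max_new_tokens=30, pad_token_id=tok.eos_token_id)
print(tok.decode(out_ids[0]))
handle.remove()
```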
Mar 5
Superintelligence is destabilizing.

If China were on the cusp of building it first, Russia or the US would not sit idly by—they'd potentially threaten cyberattacks to deter its creation.

@ericschmidt @alexandr_wang and I propose a new strategy for superintelligence. 🧵
Some have called for a U.S. AI Manhattan Project to build superintelligence, but this would cause severe escalation. States like China would notice—and strongly deter—any destabilizing AI project that threatens their survival, just as a nuclear program can provoke sabotage.
This deterrence regime has similarities to nuclear mutual assured destruction (MAD). We call a regime where states are deterred from destabilizing AI projects Mutual Assured AI Malfunction (MAIM), which could provide strategic stability.
Feb 11
We’ve found as AIs get smarter, they develop their own coherent value systems.

For example, they value lives in Pakistan > India > China > US.

These are not just random biases, but internally consistent values that shape their behavior, with many implications for AI alignment. 🧵
As models get more capable, the "expected utility" property emerges: they don't just respond randomly, but instead make choices by consistently weighing different outcomes and their probabilities.
When comparing risky choices, their preferences are remarkably stable.
We also find that AIs increasingly maximize their utilities, suggesting that in current AI systems, expected utility maximization emerges by default. This means that AIs not only have values, but are starting to act on them.
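A minimal sketch of how such utilities can be recovered and the expected-utility property checked, assuming you have logged the model's forced choices between outcomes. The outcome names and choice data below are made up for illustration, and this mirrors a generic Bradley-Terry-style fit rather than necessarily the paper's exact procedure.

```python
# Requires: pip install numpy scipy
import numpy as np
from scipy.optimize import minimize

outcomes = ["save_life_Pakistan", "save_life_India", "save_life_China", "save_life_US"]
# (winner, loser): index of the outcome the model preferred in each forced choice (made up).
choices = [(0, 1), (0, 2), (1, 2), (2, 3), (1, 3), (0, 3)]

def neg_log_likelihood(u):
    # Bradley-Terry: P(i chosen over j) = sigmoid(u_i - u_j); small ridge keeps the fit well-posed.
    nll = -sum(np.log(1.0 / (1.0 + np.exp(-(u[i] - u[j])))) for i, j in choices)
    return nll + 0.01 * np.sum(u ** 2)

u = minimize(neg_log_likelihood, np.zeros(len(outcomes))).x
u -= u.mean()                      # utilities are identified only up to a constant
print(dict(zip(outcomes, u.round(2))))

# Expected-utility check: a 50/50 lottery over outcomes 0 and 3 versus outcome 1 for certain.
# An expected-utility maximizer should pick whichever side has the higher value.
eu_lottery, eu_certain = 0.5 * u[0] + 0.5 * u[3], u[1]
print("prefer lottery" if eu_lottery > eu_certain else "prefer certainty")
```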
Jan 23
We’re releasing Humanity’s Last Exam, a dataset with 3,000 questions developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning.

State-of-the-art AIs get <10% accuracy and are highly overconfident.
@ai_risk @scaleai
Paper, dataset, and code: lastexam.ai
NYT article: nytimes.com/2025/01/23/tec…
Spot errors with the dataset? Correction form here: docs.google.com/forms/d/1M6djW…
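As a side note on "highly overconfident": a minimal sketch of how that can be quantified, assuming each answer comes with a stated confidence and a correctness label. The recipe below is a generic expected calibration error, not necessarily the exact metric used for the benchmark, and all numbers are made up.

```python
import numpy as np

# Stated confidences and correctness labels for a handful of answers (made-up numbers).
confidence = np.array([0.95, 0.90, 0.99, 0.80, 0.85, 0.92, 0.70, 0.97])
correct    = np.array([0,    0,    1,    0,    0,    0,    1,    0], dtype=float)

def expected_calibration_error(conf, corr, n_bins=5):
    """Size-weighted mean |confidence - accuracy| over equal-width confidence bins."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(conf[mask].mean() - corr[mask].mean())
    return ece

print(f"accuracy {correct.mean():.0%}, mean confidence {confidence.mean():.0%}, "
      f"ECE {expected_calibration_error(confidence, correct):.2f}")
```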
Sep 9, 2024
We've created a demo of an AI that can predict the future at a superhuman level (on par with groups of human forecasters working together).
Consequently I think AI forecasters will soon automate most prediction markets.

demo: forecast.safe.ai
blog: safe.ai/blog/forecasti…
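A minimal sketch of how "on par with groups of human forecasters" can be scored, assuming (as is standard in forecasting) that probabilistic predictions on resolved binary questions are compared by Brier score; all numbers below are illustrative.

```python
# Predicted probabilities for five resolved yes/no questions (all numbers illustrative).
ai_probs    = [0.80, 0.30, 0.65, 0.10, 0.55]   # the AI forecaster
crowd_probs = [0.75, 0.35, 0.60, 0.20, 0.50]   # an aggregated human crowd
resolved    = [1,    0,    1,    0,    1]      # how each question actually resolved

def brier(probs, outcomes):
    """Mean squared error between forecast and outcome; lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)

print(f"AI Brier:    {brier(ai_probs, resolved):.3f}")
print(f"Crowd Brier: {brier(crowd_probs, resolved):.3f}")
```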



The demo shown is a mock-up; forecasting is _not_ currently integrated into X.
The bot could have been called "Nate Gold," but I didn't get permission from @NateSilver538 in time; hence it's FiveThirtyNine instead
@PTetlock @tylercowen @ezraklein @kevinroose
Jun 6, 2024
California Bill SB 1047 has been amended after significant public discussion and negotiations between many different stakeholders. The amendments preserve most of the benefits of the bill and fix some potential loopholes. 🧵
#1: The bill updates the threshold for a model to be covered. Some have argued that since compute gets cheaper over time, the 10^26 threshold would soon cover small startups. This criticism was overblown before, and even more so now, with the additional $100M threshold.
With the amendments, if a model costs less than $100M to train, developers have no extra obligations. This could mean some hazardous models end up not being covered over time, but I expect that hazardous capabilities will arise first in $100M+ models; the change trades some recall for precision.
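A tiny sketch of the coverage logic as described here, assuming a model is covered only when it crosses both the 10^26 FLOP threshold and the added $100M training-cost threshold (illustrative pseudologic, not legal text).

```python
FLOP_THRESHOLD = 1e26            # training-compute threshold from the original bill
COST_THRESHOLD = 100_000_000     # the added $100M training-cost threshold

def is_covered(training_flops: float, training_cost_usd: float) -> bool:
    """A model triggers the bill's developer obligations only if it crosses both thresholds."""
    return training_flops >= FLOP_THRESHOLD and training_cost_usd >= COST_THRESHOLD

print(is_covered(2e26, 250_000_000))   # frontier-scale run  -> True (covered)
print(is_covered(5e25, 20_000_000))    # small startup run   -> False (no extra obligations)
```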
