Forecasting Research Institute
Research institute focused on developing forecasting methods to improve decision-making on high-stakes issues, co-founded by chief scientist Philip Tetlock.
Jan 8
🏆 In October, we invited external teams to submit to ForecastBench, our AI forecasting benchmark.

The challenge? Beat superforecasters—using any tools available (scaffolding, ensembling, etc.).

The result? External submissions are now the most accurate models on our leaderboard—though superforecasters still hold #1.

@xai's model (grok-4-fast) is the leading external submission, at #2.

One of Cassi's entries takes the #3 spot.

Here's what changed. 🧵

In October, we opened up ForecastBench’s tournament leaderboard to external submissions. Teams are free to use any tools they choose.

Several teams responded, including @xai, Cassi, @fractalai, @lightningrodai, and @_Mantic_AI. Thanks to all of them for competing on this challenging benchmark.

Models from @xai and Cassi outperformed all our baseline LLM configurations.
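For readers curious what "ensembling" means here: combining several probability forecasts for the same question into a single number. Below is a minimal sketch; the median aggregator and all numbers are illustrative assumptions, not a description of any team's actual pipeline.

```python
import statistics

def ensemble_forecast(probabilities: list[float]) -> float:
    """Combine several probability forecasts for one binary question.

    The median is a common, outlier-robust aggregator; averaging
    log-odds is another option.
    """
    return statistics.median(probabilities)

# e.g., five model runs on the same question
runs = [0.62, 0.55, 0.71, 0.58, 0.60]
print(ensemble_forecast(runs))  # 0.6
```

Scaffolding, by contrast, usually refers to wrapping the model in retrieval and structured prompting before it produces such a number.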
Nov 10, 2025
Today, we are launching the most rigorous ongoing source of expert forecasts on the future of AI: the Longitudinal Expert AI Panel (LEAP).

We’ve assembled a panel of 339 top experts across computer science, AI industry, economics, and AI policy.

Roughly every month—for the next three years—they’ll provide precise, falsifiable forecasts on the trajectory of AI capabilities, adoption, and impact.

Our results cover where experts predict major effects of AI, where they expect less progress than AI industry leaders, and where they disagree.

LEAP experts forecast major effects of AI by 2030, including:

⚡ 7x increase in AI’s share of U.S. electricity use (1% -> 7%)
🖥️ 9x increase in AI-assisted work hours (2% -> 18%)

By 2040, experts predict:
👥 30% of adults will use AI for companionship daily
🏆 60% chance that AI will solve or substantially assist in solving a Millennium Prize Problem
🚂 32% chance that AI will have been at least as impactful as a "technology of the millennium," like the printing press or the Industrial Revolution.

🧵 Read on for more insights and results.
Our LEAP panel is made up of the following experts:

🧑‍🔬 76 top computer scientists (e.g., professors from top-20 universities)
🤖 76 AI industry experts (from frontier model and other leading AI companies)
💲 68 leading economists (including many studying economic growth or technology at top universities)
🧠 119 policy and think tank experts
🏆 12 honorees from TIME’s 100 most influential people in AI (2023 and 2024)

(Plus 60 highly accurate superforecasters and 1,400 members of the U.S. public)

For more details on our sample, see the full reports linked below.
Oct 8, 2025
Is AI on track to match top human forecasters at predicting the future?

Today, FRI is releasing an update to ForecastBench—our benchmark that tracks how accurate LLMs are at forecasting real-world events.

A trend extrapolation of our results suggests LLMs will reach superforecaster-level forecasting performance around a year from now.

Here’s what you need to know: 🧵

Why LLM forecasting accuracy is a useful benchmark:

🧠 Forecasting requires collecting and synthesizing data, causal reasoning, and probabilistic thinking, making it a good test of reasoning

💼 Forecasting has high practical value

🔮 Future events aren’t in training data, making the benchmark hard to game

@elonmusk: “The ability to predict the future is the best measure of intelligence”

x.com/elonmusk/statu…
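For intuition on the trend extrapolation mentioned at the top of this thread, here is a minimal sketch: fit a line to the best LLM scores over time and solve for where it crosses the superforecaster level. Every number below is invented for illustration; the real analysis uses ForecastBench's actual leaderboard scores, which are not reproduced here.

```python
import numpy as np

# Invented data: best-LLM Brier scores over time (lower is better)
# and a fixed superforecaster reference score.
dates = np.array([2024.50, 2024.75, 2025.00, 2025.25, 2025.50])
llm_scores = np.array([0.135, 0.128, 0.122, 0.117, 0.111])
superforecaster_score = 0.093

# Fit a straight line to the LLM scores and solve for the date
# at which the fitted line crosses the superforecaster level.
slope, intercept = np.polyfit(dates, llm_scores, 1)
crossing = (superforecaster_score - intercept) / slope
print(f"Projected parity around {crossing:.1f}")  # ~2026.3 on these invented numbers
```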
Sep 2, 2025
We now have the first accuracy results from the largest-ever existential risk forecasting tournament.

In 2022, we convened 80 experts and 89 superforecasters for the Existential Risk Persuasion Tournament (XPT), which collected thousands of forecasts on 172 questions across short-, medium-, and long-term time horizons.

We now have answers for 38 short-run questions covering AI progress, climate technology, bioweapons, nuclear weapons, and more.

Here’s what we found out: 🧵

Respondents—especially superforecasters—underestimated AI progress.

Participants predicted the state-of-the-art accuracy of ML models on the MATH, MMLU, and QuaLITY benchmarks by June 2025.

Domain experts assigned probabilities of 21.4%, 25%, and 43.5% to the achieved outcomes.

Superforecasters assigned even lower probabilities: just 9.3%, 7.2%, and 20.1%, respectively.
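To make those numbers concrete: under the Brier score, a standard accuracy measure for probabilistic forecasts (lower is better; a flat 50% forecast scores 0.25), placing little probability on what actually happens is costly. Below is a quick worked computation using the figures above, treating each question as a binary event that resolved YES; this is an illustrative simplification, not necessarily XPT's exact scoring rule.

```python
# Brier score when the event occurs: (p - 1) ** 2,
# where p is the probability the forecaster placed on it.
for label, p in [("MATH", 0.214), ("MMLU", 0.25), ("QuaLITY", 0.435)]:
    print(f"{label}: domain-expert Brier = {(p - 1) ** 2:.3f}")
# MATH: 0.618, MMLU: 0.562, QuaLITY: 0.319
# A forecaster who had said 90% would have scored (0.9 - 1)**2 = 0.01.
```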
Oct 1, 2024
Today, we're excited to announce ForecastBench: a new benchmark for evaluating AI and human forecasting capabilities. Our research indicates that AI remains worse at forecasting than expert forecasters. 🧵

arXiv: arxiv.org/abs/2409.19839
Website: forecastbench.org

Evaluating LLM forecasting ability is tricky! Prior work asks models about events that already have (or have not) occurred, risking training-data contamination.
Our solution is to use questions about future events, the outcomes of which are unknowable when forecasts are made.
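Concretely, every question carries a forecast deadline that falls before its outcome can be known. A minimal sketch of that ordering; the field and function names are illustrative, not ForecastBench's actual schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Question:
    text: str
    forecast_due: date     # submissions are frozen here...
    resolution_date: date  # ...before the outcome is knowable

def forecast_counts(q: Question, submitted: date) -> bool:
    """A forecast counts only if it was locked in by the due date,
    which itself precedes resolution; this ordering is what blocks
    training-data contamination."""
    return submitted <= q.forecast_due < q.resolution_date

q = Question("Will <some event> occur by mid-2026?",
             forecast_due=date(2025, 10, 8),
             resolution_date=date(2026, 6, 30))
print(forecast_counts(q, date(2025, 10, 1)))  # True
print(forecast_counts(q, date(2026, 7, 1)))   # False
```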